A Brief Review of the SegNet Paper + SegResNet

망삼드 2022. 7. 18. 19:48

The project I was recently assigned uses SegResNet, which I understood to be SegNet-based, so I ended up reading "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation". *Authors: Vijay Badrinarayanan, Alex Kendall, and Roberto Cipolla.

The architecture section is the important part, so I focused my reading on it.

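The encoder and decoder mirror each other, and a final pixel-wise classification layer follows the decoder.
The encoder has 13 conv layers, identical to the conv layers of the VGG16 network designed for classification ⇒ because the structure is mirrored, the decoder also has 13 conv layers.
The fully connected layers are dropped, which keeps higher-resolution feature maps at the deepest encoder output. This also greatly reduces the number of parameters in the SegNet encoder.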
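To make the layer and parameter counts above concrete, here is a small sketch that counts them from the standard VGG16 layout. The configuration list and the helper names are my own illustration, not code from the paper; the fully connected sizes assume the original VGG16 classifier head (224x224 input, 1000 classes).

```python
# VGG16 convolutional layout: an integer is the output channel count of a
# 3x3 conv layer, "M" marks a 2x2 max-pooling layer.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def conv_param_count(cfg, in_channels=3, kernel=3):
    """Count weights + biases of the conv layers described by cfg."""
    total = 0
    for item in cfg:
        if item == "M":
            continue  # pooling layers have no parameters
        total += in_channels * item * kernel * kernel + item
        in_channels = item
    return total


def fc_param_count(sizes):
    """Count weights + biases of a chain of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


num_convs = sum(1 for item in VGG16_CFG if item != "M")
num_pools = VGG16_CFG.count("M")
conv_params = conv_param_count(VGG16_CFG)
# Assumed FC head of the original VGG16 classifier, which SegNet discards.
fc_params = fc_param_count([512 * 7 * 7, 4096, 4096, 1000])

print(num_convs, num_pools)      # 13 5
print(conv_params, fc_params)    # 14714688 123642856
```

Under these assumptions the 13 conv layers hold about 14.7M parameters, while the discarded fully connected layers would hold about 123.6M, which is why dropping them shrinks the encoder so much.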

decoder

The decoder has 13 conv layers, and the final decoder output is fed to a multi-class soft-max classifier that outputs class probabilities for each pixel.
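The pixel-wise soft-max step can be sketched in plain Python as follows. This is only an illustration on a toy 2x2 score map with 3 classes; the function names and the numbers are made up for the example.

```python
import math


def pixelwise_softmax(scores):
    """scores[k][i][j] is the score of class k at pixel (i, j).

    Returns probabilities with the same layout; at every pixel the
    probabilities over the classes sum to 1.
    """
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    probs = [[[0.0] * width for _ in range(height)] for _ in range(num_classes)]
    for i in range(height):
        for j in range(width):
            pixel = [scores[k][i][j] for k in range(num_classes)]
            m = max(pixel)  # subtract the max for numerical stability
            exps = [math.exp(s - m) for s in pixel]
            z = sum(exps)
            for k in range(num_classes):
                probs[k][i][j] = exps[k] / z
    return probs


def predict_labels(probs):
    """Pick the most probable class at each pixel."""
    num_classes = len(probs)
    height, width = len(probs[0]), len(probs[0][0])
    return [[max(range(num_classes), key=lambda k: probs[k][i][j])
             for j in range(width)] for i in range(height)]


# Toy decoder output: 3 classes on a 2x2 image (made-up values).
scores = [
    [[2.0, 0.1], [0.0, 1.0]],   # class 0
    [[1.0, 3.0], [0.0, 1.0]],   # class 1
    [[0.5, 0.2], [4.0, 1.5]],   # class 2
]
probs = pixelwise_softmax(scores)
labels = predict_labels(probs)
print(labels)  # [[0, 1], [2, 2]]
```

The predicted segmentation is simply the class with the highest probability at each pixel.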

Each encoder performs convolution with a filter bank to produce a set of feature maps. These are batch normalized [51],[52], and an element-wise ReLU, max(0, x), is applied. Max pooling then uses a 2x2 window with stride 2, so the windows do not overlap.

max pooling → used to achieve translation invariance over small spatial shifts in the input image
sub-sampling → gives each pixel of the feature map a large input image context (spatial window)
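A tiny example helps me see what "translation invariance over small shifts" means for 2x2, stride-2 max pooling. The helper names and the 4x4 toy images are my own; the sketch assumes even height and width.

```python
def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling with stride 2 (even height/width assumed)."""
    h, w = len(x), len(x[0])
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]


def single_activation(row, col, size=4, value=9):
    """Toy image: all zeros except one activation at (row, col)."""
    img = [[0] * size for _ in range(size)]
    img[row][col] = value
    return img


pooled_a = max_pool_2x2(single_activation(0, 0))
pooled_b = max_pool_2x2(single_activation(1, 1))  # shifted inside the same window
pooled_c = max_pool_2x2(single_activation(1, 2))  # shifted into the next window

print(pooled_a)  # [[9, 0], [0, 0]]
print(pooled_b)  # [[9, 0], [0, 0]]
print(pooled_c)  # [[0, 9], [0, 0]]
```

The first two outputs are identical, so the pooled map no longer knows where inside the window the activation was. That is the invariance, and it is also exactly the positional (boundary) detail that gets lost.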

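This gives spatial invariance, but at the cost of spatial resolution in the feature maps. → An increasingly lossy image representation (in terms of boundary detail) is bad for segmentation. → Boundary information therefore needs to be captured and stored in the encoder feature maps before sub-sampling.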

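To address this:

only the max-pooling index (the location of the maximum within each 2x2 pooling window) is stored, which takes 2 bits per window
the decoder upsamples its input using these memorized max-pooling indices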
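The two points above can be sketched in plain Python: pooling that also records the argmax position (0 to 3, so 2 bits are enough), and unpooling that puts each value back at its remembered position. The function names and the toy feature map are my own illustration, not the authors' code. PyTorch offers the same idea through `MaxPool2d(return_indices=True)` and `MaxUnpool2d`.

```python
def max_pool_2x2_with_indices(x):
    """2x2 max pooling, stride 2.

    Returns the pooled map and, for each window, the position of the
    maximum encoded as 0..3 (row-major inside the window), i.e. 2 bits.
    """
    h, w = len(x), len(x[0])
    pooled, indices = [], []
    for i in range(0, h, 2):
        pooled_row, index_row = [], []
        for j in range(0, w, 2):
            window = [x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1]]
            k = max(range(4), key=window.__getitem__)
            pooled_row.append(window[k])
            index_row.append(k)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool_2x2(pooled, indices):
    """Place each pooled value back at its memorized position; the rest stays 0."""
    h, w = len(pooled), len(pooled[0])
    out = [[0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            di, dj = divmod(indices[i][j], 2)
            out[2 * i + di][2 * j + dj] = pooled[i][j]
    return out


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 0, 3, 1],
]
pooled, indices = max_pool_2x2_with_indices(feature_map)
restored = max_unpool_2x2(pooled, indices)

print(pooled)    # [[4, 5], [6, 7]]
print(indices)   # [[2, 3], [2, 0]]
print(restored)  # [[0, 0, 0, 0], [4, 0, 0, 5], [0, 0, 7, 0], [6, 0, 0, 0]]
```

The restored map is sparse, but each maximum sits at its original location. In SegNet the decoder then convolves this sparse map with trainable filters to produce dense feature maps.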

segresnet

https://docs.monai.io/en/stable/apps.html

Applications — MONAI 0.9.0 Documentation

I referred to the MONAI architecture.

class monai.networks.nets.SegResNet(spatial_dims=3, init_filters=8, in_channels=1, out_channels=2, dropout_prob=None, act=('RELU', {'inplace': True}), norm=('GROUP', {'num_groups': 8}), norm_name='', num_groups=8, use_conv_final=True, blocks_down=(1, 2, 2, 4), blocks_up=(1, 1, 1), upsample_mode=UpsampleMode.NONTRAINABLE)

This model is based on the paper "3D MRI brain tumor segmentation using autoencoder regularization". It supports 2D and 3D inputs, but the VAE cannot be used, because SegResNet is the version with the VAE branch left out.
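To get a feel for what `init_filters`, `blocks_down`, and `blocks_up` in the signature above imply, here is a rough shape sketch in plain Python. It does not call MONAI. It rests on my own assumptions about the design: stage i of the encoder uses `init_filters * 2**i` channels, every stage after the first halves each spatial dimension, and each decoder stage halves the channels and doubles the resolution. The function names and the input size of 128 are hypothetical, and the input size is assumed to be divisible by 8.

```python
def encoder_stages(input_size, init_filters=8, blocks_down=(1, 2, 2, 4)):
    """Return (channels, spatial_size, num_blocks) for each encoder stage.

    Assumption: channels double at every stage and each stage after the
    first begins with a stride-2 downsampling step.
    """
    stages = []
    size = input_size
    for i, num_blocks in enumerate(blocks_down):
        if i > 0:
            size //= 2
        stages.append((init_filters * 2 ** i, size, num_blocks))
    return stages


def decoder_stages(encoder, blocks_up=(1, 1, 1)):
    """Return (channels, spatial_size, num_blocks) for each decoder stage.

    Assumption: every decoder stage halves the channels and doubles the size.
    """
    channels, size, _ = encoder[-1]
    stages = []
    for num_blocks in blocks_up:
        channels //= 2
        size *= 2
        stages.append((channels, size, num_blocks))
    return stages


enc = encoder_stages(128)
dec = decoder_stages(enc)
print(enc)  # [(8, 128, 1), (16, 64, 2), (32, 32, 2), (64, 16, 4)]
print(dec)  # [(32, 32, 1), (16, 64, 1), (8, 128, 1)]
```

With the default arguments there are four encoder stages and three decoder stages, so under these assumptions the decoder ends at the input resolution with `init_filters` channels before the final conv maps to `out_channels`.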

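*VAE: a variational autoencoder (VAE) extends the autoencoder into a probabilistic model, so that new data can be generated by sampling from the estimated probability distribution.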

*Downsampling, upsampling

Enlarging a small image is called upsampling, and shrinking a large image is called downsampling.
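As a minimal illustration of the two terms, the sketch below upsamples a 2x2 image by nearest-neighbor repetition and downsamples it again by averaging 2x2 blocks. These are just two simple choices among many, picked for the example; the function names are my own.

```python
def upsample_nearest_2x(x):
    """Double height and width by repeating every pixel in a 2x2 block."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # copy so the two rows are independent lists
    return out


def downsample_mean_2x(x):
    """Halve height and width by averaging non-overlapping 2x2 blocks."""
    h, w = len(x), len(x[0])
    return [[(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]


small = [[1, 2],
         [3, 4]]
big = upsample_nearest_2x(small)
back = downsample_mean_2x(big)

print(big)   # [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
print(back)  # [[1.0, 2.0], [3.0, 4.0]]
```

Going up and then down recovers the original here only because nearest-neighbor upsampling adds no new information; going down first and then up would lose detail.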

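< During my undergraduate internship..

I have definitely studied this and read it several times, yet only the keywords come to mind, and whenever my professor asks me about it I stumble over my words. Reviewing is a must, and this has been a time of reminding myself over and over that I need an accurate understanding of what I am doing. Most of my trouble has been server issues, and with all the different errors I kept thinking how much I still lack. Still, I am already approaching my third week, I have started running the model, and I am getting some results of my own. Of course I am working on a laptop, so there are limits, and I have been caught off guard many times when training was cut off mid-epoch. Honestly, I got frustrated enough to think about quitting more than once, but I believe this is a chance to learn far more than I would by typing away on my laptop alone at home.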

