2024.07.05

기본 딥러닝 지식을 공부합니다.

Posted Jul 5, 2024

By Chaewon Mun

5 min read

< Pytorch >

Define by Run
실행을 하면서 그래프를 생성하는 방식 => pytorch
Define and Run
그래프를 먼저 정의하고 실행시점에 데이터를 주입하는 방식 => tensorflow
flatten() 메서드
tensor를 1차원으로 변환해준다
‘view’와 ‘reshape’의 차이점

: 두 메서드 모두 텐서의 차원을 바꾸는데 사용

view
텐서가 연속적(contiguous)인 경우에만 사용
텐서와 메모리 공유
reshape
텐서가 연속적(contiguous)이지 않은 경우에도 사용가능
텐서와 메모리 공유 + 필요시 새로운 메모리 할당 가능

mm 함수 & matmul 함수
1. torch.mm()
  두 2D 텐서간의 ‘행렬 곱’을 수행함
2. torch.matmul()
  두 텐서간 ‘행렬 곱’을 수행하며, 다양한 차원의 텐서에 대해 ‘broadcasting‘을 지원함
PyTorch project template의 특징
1. 로깅, 지표, 유틸리티 등의 코드를 각각의 파일에서 관리하여 사용하기 편리하다
2. 프로젝트의 여러 모듈들이 분리 돼 있어 코드를 재사용하기에 편리하다
3. 쉬운 코드 재현이 가능하다
getitem 매직 메서드의 특징
1. 전달받는 인덱스가 항상 정수일 필요 없다
2. 보통 ‘읽기 전용’ 연산으로 간주된다
3. 보통 하나의 인자만 받는다
4. 구현 시 ‘for’ 루프나 ‘in’ 연산자를 클래스 인스턴스에 사용할 수 있다
5. 구현 시 클래스 인스턴스를 리스트나 딕셔너리처럼 사용할 수 있다
nn.Parameter
1. ‘nn.Parameter’ 객체는 ‘nn.Module’ 내 attribute로 설정되면 ‘requires_grad = True’로 지정되어 학습 대상이 된다
2. ‘torch.Tensor’를 상속한 클래스로 동일한 기능을 가진다
3. ‘nn.Module’ 내에서 사용될 때 학습 대상이 된다
4. 자동으로 ‘requires_grad = True’ 로 설정된다
5. 모델의 gradient 계산이 가능하다. 모델의 학습을 위해 nn.Parameter를 사용해 가중치와 편향을 정의한다
backward() 메서드
실제값과 forward의 결과값 간의 loss에 대해 미분을 수행한다
Linear layer
‘in_features’ 차원의 벡터를 입력으로 받아, ‘out_features’ 차원으로 선형변환 후, bias를 더해주는 연산

  
class MyLinear(nn.Module):
    def __init__(self, in_features, out_features, bias=True):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features

        self.weights = nn.Parameter(torch.randn(in_features, out_features))
        self.bias = nn.Parameter(torch.randn(out_features))

    def forward(self, x: Tensor):
        return x @ self.weights + self.bias

CustomDataset
=> PyTorch 의 utils.data 모듈로부터 Dataset 클래스를 상속받아 CustomDataset을 구현할 때, 필수적으로 오버라이딩 해야하는 매직 메서드
getitem, init, len
DataLoader
주어진 데이터셋을 미니배치로 분할하고, 미니배치별로 학습을 할 수 있게 도와준다
torch.save()
모델 저장시,모델의 아키텍처와 가중치 를 함께 저장
CustomDataset 과 DataLoader를 구현

  
import torch
from torch.utils.data import Dataset, DataLoader

class CustomDataset(Dataset):
    def __init__(self, text, labels):
        self.labels = labels
        self.data = text

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        label = self.labels[idx]
        text = self.data[idx]
        sample = {"Text": text, "Class": label}
        # str 데이터가 숫자로 인코딩되지 않음 => torch.tensor 객체로 데이터 반환 x
        return sample

text = ['Happy', 'Amazing', 'Sad', 'Unhappy', 'Glum']
labels = ['Positive', 'Positive', 'Negative', 'Negative', 'Negative']
MyDataset = CustomDataset(text, labels)

MyDataLoader = DataLoader(MyDataset, batch_size=2, shuffle=False)
# 배치크기 2, 데이터 섞지 않음
print(next(iter(MyDataLoader)))
# {'Text': ['Happy', 'Amazing'], 'Class': ['Positive', 'Positive']} 출력

PyToch에서 학습 중간 결과를 저장하고 불러오기 위해 checkpoint 사용해 모델의 상태 불러오려면?

  
# 모델의 상태를 불러오려면 oad_state_dict 메서드를 사용
model.load_state_dict(checkpoint['model_state_dict'])

네이버 boostcourse의 인공지능 기초 다지기 퀴즈를 참조함

TIL, Machine Learning

This post is licensed under CC BY 4.0 by the author.

< Pytorch >

Trending Tags