퍼셉트론

개발

2024.03.13 15:17

퍼셉트론의 구조

입력값마다 다른 weight을 곱한 값을 모두 더하고 여기에 bias라고 불리는 값을 더한다.

더해진 총 합은 활성화 함수에 적용, 활성화 수준을 계산한 값이 출력된다.

여기서, 출력값과 목표 값이 다른 경우 Error를 통해 가중치를 업데이트한다.

결국 학습이라는 것은 이 가중치를 반복적으로 조정하면서 알맞은 가중치와 bias, 즉 학습 목표인 두 부류로 선형분리하기 위한 학습 벡터를 찾아내는 과정이라고 볼 수 있다.

단층 퍼셉트론의 구현

import tensorflow as tf
import numpy as np

def step_func(x): # 계단함수
  return (x >= 0) * 1

def sigmoid(x):
  return 1 / (1 + np.exp(-x))

fun = step_func

x = np.array([[1,1],[1,0],[0,1],[0,0]])
y = np.array([[1], [0], [0], [0]])
w = tf.random.normal([2], 0, 1) # 초기 가중치
b = 1.0 # bias 초기값
w0 = 0.5
a = 0.1 # 학습률

for i in range(1000):
  error_sum = 0
  for j in range(4):
    output = fun(np.sum(x[j] * w) + b * w0)
    error = y[j][0] - output # 오차 = 예측값 - 실제 출력값
    w = w + x[j] * a * error
    b = b + a * error
    error_sum += error

    if i % 50 == 0:
      print(i, error_sum)

for i in range(4):
  if fun(np.sum(x[i] * w) + b * w0) > 0.0: # sigmoid인 경우 0.5로 설정
    output = 1.0
  else:
    output = 0.0

  print('X: ', x[i], ' Y: ', y[i], ' Output: ', fun(np.sum(x[i] * w) + b * w0), " result: ", output)

# 출력 : step_func
X: [1 1] Y: [1] output: 1 result: 1.0
X: [1 0] Y: [0] output: 0 result: 0.0
X: [0 1] Y: [0] output: 0 result: 0.0
X: [0 0] Y: [0] output: 0 result: 0.0

# 출력 : sigmoid
X:  [1 1]  Y:  [1]  Output:  0.9058413807262921  result:  1.0
X:  [1 0]  Y:  [0]  Output:  0.07412882460706796  result:  0.0
X:  [0 1]  Y:  [0]  Output:  0.07442403891386311  result:  0.0
X:  [0 0]  Y:  [0]  Output:  0.000668736638863177  result:  0.0

결과가 정확하거나 유사하게 나오는 것을 볼 수 있다.

단층 퍼셉트론의 한계

xor과 같은 경우에는 하나의 직선으로 선형분리할 수 없다는 한계가 있다.

X:  [1 1]  Y:  [0]  Output:  0.5095529382491947  result:  1.0
X:  [1 0]  Y:  [1]  Output:  0.5095529568688468  result:  1.0
X:  [0 1]  Y:  [1]  Output:  0.5031846645024431  result:  1.0
X:  [0 0]  Y:  [0]  Output:  0.5031846831281389  result:  1.0

다층 퍼셉트론

NAND와 OR을 AND한 결과를 내어, 계층을 두개로 만들면 (다층 퍼셉트론을 만들면) 해결할 수 있다.

import numpy as np

def AND(x1,x2):
    x = np.array([x1,x2])
    w = np.array([0.5,0.5])
    b = -0.7
    tmp = np.sum(w*x)+b
    if tmp <= 0 :
        return 0
    else :
        return 1

def NAND(x1,x2):
    x = np.array([x1,x2])
    w = np.array([-0.5,-0.5])
    b = 0.7
    tmp = np.sum(w*x)+b
    if tmp <= 0 :
        return 0
    else :
        return 1

def OR(x1,x2):
    x = np.array([x1,x2])
    w = np.array([0.5,0.5])
    b = -0.2
    tmp = np.sum(w*x)+b
    if tmp <= 0 :
        return 0
    else :
        return 1

def XOR(x1, x2):
    s1 = NAND(x1, x2)
    s2 = OR(x1, x2)
    y = AND(s1, s2)
    return y

print(XOR(0,0))
print(XOR(0,1))
print(XOR(1,0))
print(XOR(1,1))

// 결과
0
1
1
0

aiRNN aiVector Search ai선형회귀 ai활성화함수 reactnativeNew Architecture reactnativeReact Native