[Tensor] vectorized op. (or universal func)

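The ufuncs (universal functions) provided by NumPy

apply the same operation to every element
of a homogeneous, contiguous tensor.
Compared with operations built on an explicit loop,
they run far faster and let the whole computation be written with a single operator.

A ufunc is a kind of vectorized operation (or vector operation).

In practice, a ufunc can be seen as a binding to

a binary implementation optimized for homogeneous, contiguous data.

A Python interface is provided, so ufuncs are easy to use,

but internally a binary implementation written in a compiled language such as C or C++ does the work.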
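
A small sketch of what this looks like from the Python side, assuming only NumPy is installed. The variable names (x, y, via_operator, via_ufunc, total) are illustrative and not part of the original post.

```python
import numpy as np

# np.add is an instance of numpy.ufunc, a callable object backed by compiled loops
print(type(np.add))               # <class 'numpy.ufunc'>
print(np.add.nin, np.add.nout)    # 2 1

x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
y = np.array([10.0, 20.0, 30.0], dtype=np.float32)

# The + operator on ndarrays gives the same result as calling the ufunc directly
via_operator = x + y
via_ufunc = np.add(x, y)
print(np.array_equal(via_operator, via_ufunc))  # True

# ufuncs also carry methods such as reduce
total = np.add.reduce(x)
print(total)  # 6.0
```

The Python object is only the interface; the element-wise loop itself runs in compiled code.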

https://ds31x.tistory.com/208

[Programming] glue code and binding


Vectorized Op.

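Vector operations apply the same operation to an entire vector (or matrix) at once, rather than to a scalar (a single value),

and have the following advantages.

Concise code: element-wise operations between vectors or matrices can be expressed without writing a loop.
Efficiency: vector operations can use SIMD (Single Instruction, Multiple Data) instructions and run much faster than the loop-based approach (the CPU or other hardware must support SIMD).

The drawback is that they tend to require more memory, for example for intermediate results.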
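
One way the extra memory shows up is in temporary arrays. The sketch below, assuming NumPy only and an arbitrary array size n, compares a plain expression with the same computation written into a preallocated buffer through the ufunc out argument. The names a, b, c, buf, and result are illustrative.

```python
import numpy as np

n = 1_000_000
a = np.ones(n, dtype=np.float32)
b = np.full(n, 2.0, dtype=np.float32)
c = np.full(n, 3.0, dtype=np.float32)

# Straightforward form: a * b creates a temporary array before + c creates the result
result = a * b + c

# Same computation reusing one preallocated buffer through the ufunc `out` argument
buf = np.empty_like(a)
np.multiply(a, b, out=buf)   # buf = a * b
np.add(buf, c, out=buf)      # buf = buf + c, written in place

print(np.array_equal(result, buf))  # True
print(buf.nbytes)                   # 4000000
```

Whether the saving matters depends on the array size; for small arrays the plain expression is usually the more readable choice.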

Vectorized operations are provided out of the box by NumPy, PyTorch, and TensorFlow.

Example 0

The following example shows the difference between a for loop and a vector operation.

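import numpy as np

def add_vectors(x, y):
  """Add two vectors element by element with an explicit loop."""
  result = []
  for i in range(len(x)):
    result.append(x[i] + y[i])
  return result

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

z = add_vectors(x, y)

print(z)
# a Python list holding 5, 7, 9 (the exact repr of the elements depends on the NumPy version)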

import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

z = np.add(x, y)  # vector addition

print(z)
# [5 7 9]
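
To get a feel for the speed gap, the two approaches can be timed with timeit. This is a rough sketch, assuming NumPy only. The array size n and the repeat count are arbitrary choices, add_loop is an illustrative helper, and the measured numbers vary by machine.

```python
import timeit
import numpy as np

def add_loop(x, y):
    """Element-by-element addition with a Python loop."""
    return [x[i] + y[i] for i in range(len(x))]

n = 100_000
x = np.arange(n, dtype=np.float64)
y = np.arange(n, dtype=np.float64)

# Both approaches must agree before timing them
assert np.allclose(add_loop(x, y), np.add(x, y))

t_loop = timeit.timeit(lambda: add_loop(x, y), number=5)
t_vec = timeit.timeit(lambda: np.add(x, y), number=5)
print(f"loop: {t_loop:.4f}s, vectorized: {t_vec:.4f}s")
```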

Representative operations.

The code below initializes the tensor instances used in the following examples.

import numpy as np
import torch
import tensorflow as tf

for i in [np,torch,tf]:
  print( f"{i.__name__}:{i.__version__}")
  
a = np.arange(6).reshape((3,2)).astype(np.float32)
b = np.ones((3,2),dtype=np.float32)
print(a.dtype, b.dtype)

a_t = torch.tensor(a).float()
b_t = torch.tensor(b).float()
print(a_t.dtype, b_t.dtype)

a_tf = tf.convert_to_tensor(a,dtype=tf.float32)
b_tf = tf.convert_to_tensor(b,dtype=tf.float32)
print(a_tf.dtype, b_tf.dtype)

addition

c = a + b
print(c)
c_t = a_t + b_t
print(c_t)
c_tf = a_tf + b_tf
print(c_tf)

multiplication

c = a * b
print(c)
c_t = a_t * b_t
print(c_t)
c_tf = a_tf * b_tf
print(c_tf)

subtraction (or difference)

c = a - b
print(c)
c_t = a_t - b_t
print(c_t)
c_tf = a_tf - b_tf
print(c_tf)

division

c = a / b
print(c)
c_t = a_t / b_t
print(c_t)
c_tf = a_tf / b_tf
print(c_tf)

negation

c = -a 
print(c)
c_t = -a_t
print(c_t)
c_tf = -a_tf
print(c_tf)

power and exponentiation

the mathematical operation of raising a number to a power.

https://ds31x.blogspot.com/2023/08/math-exponential-vs-power.html

Math : exponential vs. power


https://dsaint31.tistory.com/683

[Math] Exponential Function


This can be handled easily with the ordinary operator.

c = a**2
print(c)
c_t = a_t**2
print(c_t)
c_tf = a_tf**2
print(c_tf)

The following uses ufuncs (NumPy additionally provides square for the power-of-2 case).

c = np.square(a)
print(c)
c = np.power(a,2)
print(c)
print('-------------')
c_t = torch.pow(a_t,2)
print(c_t)
c_tf = tf.pow(a_tf,2)
print(c_tf)

Exponentiation with Euler's number as the base (e^x) and with 2 as the base (2^x) is provided by exp and exp2.

c = np.exp(a)
print(c)
c_t = torch.exp(a_t)
print(c_t)
c_tf = tf.exp(a_tf)
print(c_tf)

c = np.exp2(a)
print(c)
c_t = torch.exp2(a_t)
print(c_t)
# c_tf = tf.exp2(a_tf) # not available: TensorFlow has no tf.exp2
# print(c_tf)
c_tf = tf.pow(2.,a_tf)
print(c_tf)
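
As a quick sanity check of what exp and exp2 compute, the identities can be verified numerically. A minimal sketch, assuming NumPy only and re-creating the same a as in the setup code:

```python
import numpy as np

a = np.arange(6).reshape((3, 2)).astype(np.float32)

# exp2(a) is 2 raised to each element, the same as np.power(2, a)
print(np.allclose(np.exp2(a), np.power(2.0, a)))  # True

# exp(a) is e raised to each element, and log undoes it
print(np.allclose(np.log(np.exp(a)), a))  # True
```

This is also why tf.pow(2., a_tf) works as a stand-in for the missing exp2 in TensorFlow.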

log

By default, log is the natural logarithm, whose base is Euler's number (written ln in mathematical notation),
but base-2 and common (base-10) logarithms are also provided (log2, log10).

https://dsaint31.tistory.com/578

[Math] log (logarithmic) function


c = np.log(a)
print(c)
c_t = torch.log(a_t)
print(c_t)
c_tf = tf.math.log(a_tf)
print(c_tf)

c = np.log2(a)
print(c)
c_t = torch.log2(a_t)
print(c_t)
# c_tf = tf.math.log2(a_tf) # not available in tf.math
# print(c_tf)
c_tf = tf.experimental.numpy.log2(a_tf)
print(c_tf)

c = np.log10(a)
print(c)
c_t = torch.log10(a_t)
print(c_t)
# c_tf = tf.math.log10(a_tf) # not available in tf.math
# print(c_tf)
c_tf = tf.experimental.numpy.log10(a_tf)
print(c_tf)
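
Note that a contains 0, so the log calls above produce -inf for that element (NumPy also emits a RuntimeWarning). A short sketch of how this can be handled, assuming NumPy only. Using np.errstate and np.log1p here is just one possible approach.

```python
import numpy as np

a = np.arange(6).reshape((3, 2)).astype(np.float32)

# a[0, 0] is 0, so log(0) evaluates to -inf; errstate silences the divide warning here
with np.errstate(divide="ignore"):
    c = np.log(a)

print(np.isneginf(c[0, 0]))     # True
print(np.isfinite(c).sum())     # 5

# log1p(x) computes log(1 + x) and stays finite at x == 0
c1 = np.log1p(a)
print(c1[0, 0])                 # 0.0
```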

square root

c = np.sqrt(a)
print(c)
c_t = torch.sqrt(a_t)
print(c_t)
c_tf = tf.sqrt(a_tf)
print(c_tf)

absolute value (or magnitude)

Python's built-in function abs can also be used,

but using the abs of the library that each tensor belongs to is preferable.

c = np.abs(a)           # np.abs is shorthand for np.absolute
print(c)
c_t = torch.abs(a_t)
print(c_t)
c_tf = tf.abs(a_tf)
print(c_tf)

relational operation.

These include > , < , >= , <= , != , and == .

The result is a boolean tensor.

(Used together with the boolean operations described below to find the elements that satisfy a given condition.)

c = a > b
print(c)
c_t = a_t > b_t
print(c_t)
c_tf = a_tf > b_tf
print(c_tf)
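
The boolean result is typically used as a mask. A minimal sketch, assuming NumPy only and the same a and b as in the setup code. The names mask, selected, count, and clipped are illustrative.

```python
import numpy as np

a = np.arange(6).reshape((3, 2)).astype(np.float32)
b = np.ones((3, 2), dtype=np.float32)

mask = a > b                 # boolean array with the same shape as a
print(mask.dtype)            # bool

selected = a[mask]           # 1-D array of the elements where mask is True
print(selected)              # [2. 3. 4. 5.]

count = np.count_nonzero(mask)
print(count)                 # 4

# np.where keeps a where the mask holds and uses 0 elsewhere
clipped = np.where(mask, a, 0.0)
print(clipped.shape)         # (3, 2)
```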

trigonometric operations.

sine

c = np.sin(a)
print(c)
c_t = torch.sin(a_t)
print(c_t)
c_tf = tf.sin(a_tf)
print(c_tf)

cosine

c = np.cos(a)
print(c)
c_t = torch.cos(a_t)
print(c_t)
c_tf = tf.cos(a_tf)
print(c_tf)

tangent

c = np.tan(a)
print(c)
c_t = torch.tan(a_t)
print(c_t)
c_tf = tf.tan(a_tf)
print(c_tf)

Aggregations (집계)

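The operations below can be performed along a specific axis.
- The axis along which the operation runs is reduced to a single value, so that axis is removed from the result, or kept with size 1 when requested through keepdims in numpy and tensorflow or keepdim in pytorch ( $reduce$ ).
- numpy and tensorflow specify that axis with axis,
- while pytorch uses dim.
If the data contains NaN, the NaN generally propagates into the result instead of being skipped. NaN-safe functions that ignore NaN are provided as well.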
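
The points above can be sketched with NumPy alone (assuming the same a as in the setup code). The names s0, s0_keep, and a_nan are illustrative.

```python
import numpy as np

a = np.arange(6).reshape((3, 2)).astype(np.float32)

s0 = np.sum(a, axis=0)                  # collapse the 3 rows
print(s0.shape, s0)                     # (2,) [6. 9.]

s0_keep = np.sum(a, axis=0, keepdims=True)
print(s0_keep.shape)                    # (1, 2)

# NaN propagates through the ordinary aggregation
a_nan = a.copy()
a_nan[0, 0] = np.nan
print(np.isnan(np.sum(a_nan)))          # True
print(np.nansum(a_nan))                 # 15.0
```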

2024.03.20 - [Python] - [Tensor] NaN Safe Aggregation Functions


https://dsaint31.tistory.com/216

NumPy : sum, mean, std, and so on


max

c = np.max(a)
print(c)
c_t = torch.max(a_t) # when dim is specified, values and indices are returned.
print(c_t)
c_tf = tf.reduce_max(a_tf)
print(c_tf)

min

c = np.min(a)
print(c)
c_t = torch.min(a_t) # when dim is specified, values and indices are returned.
print(c_t)
c_tf = tf.reduce_min(a_tf)
print(c_tf)

sum

c = np.sum(a)
print(c)
c_t = torch.sum(a_t)
print(c_t)
c_tf = tf.reduce_sum(a_tf)
print(c_tf)

mean (average, arithmetic mean)

c = np.mean(a, axis=0)
print(c)
c_t = torch.mean(a_t, dim=0)
print(c_t)
c_tf = tf.reduce_mean(a_tf, axis=0)
print(c_tf)

median

c = np.median(a, axis=0)
print(c)
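c_t = torch.median(a_t, dim=0).values # indices are returned as well
print(c_t)
# TensorFlow core has no median op, and reduce_mean computes the mean instead.
# Sorting and taking the middle row gives the median when the axis length is odd.
c_tf = tf.sort(a_tf, axis=0)[a_tf.shape[0] // 2]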
print(c_tf)

Examples of computing max and min along a specific axis and of finding their positions (indices)

https://gist.github.com/dsaint31x/a70c4ced5d5929b47d5725214fbee616

dl_tensor_max_min_argmax_argmin.ipynb
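
A NumPy-only sketch of the same idea, independent of the notebook linked above and using the a from the setup code. The variable names are illustrative.

```python
import numpy as np

a = np.arange(6).reshape((3, 2)).astype(np.float32)

col_max = np.max(a, axis=0)        # maximum of each column
col_argmax = np.argmax(a, axis=0)  # row index where each column maximum sits
print(col_max, col_argmax)         # [4. 5.] [2 2]

row_min = np.min(a, axis=1)        # minimum of each row
row_argmin = np.argmin(a, axis=1)  # column index where each row minimum sits
print(row_min, row_argmin)         # [0. 2. 4.] [0 0 0]

# Position of the global maximum as a (row, column) pair
flat_idx = np.argmax(a)
pos = np.unravel_index(flat_idx, a.shape)
print(tuple(int(i) for i in pos))  # (2, 1)
```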


boolean operations

Similar to Python's built-in bitwise operators.

bm0 and bm1 are boolean tensor instances with the following values

bm0= [[False False]
 [False  True]
 [ True  True]]
bm1= [[False False]
 [False False]
 [ True  True]]

A boolean tensor is usually obtained by applying a relational operation to a tensor (see the following code snippet).

bm0 = a > 2.5
bm1 = a > 3.5
print('bm0=',bm0)
print('bm1=',bm1)
bm0_t = a_t > 2.5
bm1_t = a_t > 3.5
bm0_tf = a_tf > 2.5
bm1_tf = a_tf > 3.5

The and operation uses & as shown below.

c = bm0 & bm1 # and
print(c)
c_t = bm0_t & bm1_t
print(c_t)
c_tf = bm0_tf & bm1_tf
print(c_tf)

The or operation uses | as shown below.

c = bm0 | bm1 # or
print(c)
c_t = bm0_t | bm1_t
print(c_t)
c_tf = bm0_tf | bm1_tf
print(c_tf)

The xor operation uses ^ as shown below.

c = bm0 ^ bm1 # xor
print(c)
c_t = bm0_t ^ bm1_t
print(c_t)
c_tf = bm0_tf ^ bm1_tf
print(c_tf)

The not operation uses ~ as shown below.

c = ~bm0 # not
print(c)
c_t = ~bm0_t
print(c_t)
c_tf = ~bm0_tf
print(c_tf)

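Each of the operators above actually has a corresponding ufunc.

For example, there are bitwise_and, bitwise_or, bitwise_not, and bitwise_xor.

See the following example. (In tf, the tf.bitwise functions take integer tensors, so the boolean tensors are cast to an integer type first and the result is cast back to bool.)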

c = np.bitwise_and(bm0, bm1)
print(c)
c_t = torch.bitwise_and(bm0_t, bm1_t)
print(c_t)
c_tf = tf.bitwise.bitwise_and(tf.cast(bm0_tf,tf.int32), tf.cast(bm1_tf, tf.int32))
print(tf.cast(c_tf,tf.bool))
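
Two practical points when combining masks, sketched with NumPy only: the logical_* ufuncs are an alternative to the bitwise_* ones, and comparisons need parentheses when combined with &. The names in_range and ints are illustrative.

```python
import numpy as np

a = np.arange(6).reshape((3, 2)).astype(np.float32)

bm0 = a > 2.5
bm1 = a > 3.5

# For boolean arrays, the logical_* and bitwise_* ufuncs give the same result
print(np.array_equal(np.logical_and(bm0, bm1), np.bitwise_and(bm0, bm1)))  # True
print(np.array_equal(np.logical_xor(bm0, bm1), bm0 ^ bm1))                 # True

# & binds tighter than comparison operators, so each comparison needs parentheses
in_range = (a > 0.5) & (a < 4.5)
print(a[in_range])   # [1. 2. 3. 4.]

# On integer arrays the two families differ: bitwise works per bit, logical per truth value
ints = np.array([1, 2], dtype=np.int32)
print(np.bitwise_and(ints, 2))   # [0 2]
print(np.logical_and(ints, 2))   # [ True  True]
```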

https://www.tensorflow.org/api_docs/python/tf/bitwise

Module: tf.bitwise | TensorFlow v2.15.0.post1


https://gist.github.com/dsaint31x/7cdfcce8bdf6abad833b4fd4bb021db0

dl_tensor_ufunc.ipynb

