R Programming (2) - Basic Usage and Vectors

Throughout these notes, a line starting with ">>" shows the output produced by the input on the line before it.

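02 Basic R Usage

1. Basic operators

4 / 2 # division

5 %/% 2 # integer quotient

5 %% 2 # remainder

(3 + 2)^3 # exponentiation

# Variable assignment (naming rules: a combination of letters, digits, periods, and underscores; must start with a letter, or with a period not followed by a digit; case-sensitive)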
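
The naming rules can be tried directly. This is a minimal sketch; the variable names are made up for illustration and do not come from the original notes.

```r
# Hypothetical variable names, chosen only to illustrate the naming rules.
word.count <- 10      # letters and a period
wordCount2 <- 20      # digits are allowed after the first character
WordCount2 <- 30      # a different object: names are case-sensitive

wordCount2 + WordCount2   # 50

# make.names() shows how R repairs a name that breaks the rules.
make.names("2nd")         # "X2nd"
```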

2. Data types and the class function

class(TRUE)

>> [1] "logical"

class(T)

>> [1] "logical"

class(12L)

>> [1] "integer"

class(3 + 2i)

>> [1] "complex"

class(12.3)

>> [1] "numeric"

as.numeric(12L)

>> [1] 12

class('a')

>> [1] "character"

class("good")

>> [1] "character"

class('2 + 4')

>> [1] "character"

3. Data structures

a <- c('red', 'green', 'yellow')

class(a)

>> [1] "character"

str(a)

>> chr [1:3] "red" "green" "yellow"

is.vector(a)

>> [1] TRUE

b <- c(12, 13.5, 0)

class(b)

>> [1] "numeric"

str(b)

>> num [1:3] 12 13.5 0

f <- factor(c('green', 'green', 'yellow', 'red', 'red', 'red', 'green'))

class(f)

>> [1] "factor"

str(f)

>> Factor w/ 3 levels "green","red","yellow": 1 1 3 2 2 2 1

is.factor(f)

>> [1] TRUE

g <- data.frame(gender = c('Male', 'Male', 'Female'), height = c(152, 171, 165))

class(g)

>> [1] "data.frame"

str(g)

>> 'data.frame': 3 obs. of 2 variables …

is.data.frame(g)

>> [1] TRUE

dim(g)

>> [1] 3 2

4. Removing variables

rm(a)

>> removes the variable a

rm(list = ls())

>> removes every object in the workspace

03 Vectors

1. Creating vectors and vector operations

1) Types

is.vector("apple")

>> [1] TRUE

str("apple")

>> chr "apple"

str(1.25)

>> num 1.25

str(3L)

>> int 3

str(TRUE)

>> logi TRUE

str(2 + 3i)

>> cplx 2+3i

2) Creating a vector

1

>> [1] 1

c(1)

>> [1] 1

c(1, 2, 3)

>> [1] 1 2 3

1:5

>> [1] 1 2 3 4 5

class(a); class(b)

>> [1] "numeric"

[1] "character"

* A vector can hold two or more elements

c <- 5.5 : 20.4

>> [1] 5.5 6.5 … 19.5

d <- 5.5 : 20.6

>> [1] 5.5 6.5 … 20.5

3) Vector operations

x <- c(1, 3, 5, 7, 9)

y <- c(2, 4, 6, 8, 10)

x + y

>> [1] 3 7 11 15 19

x * c(2, 4, 5)

>> [1] 2 12 25 14 36

>> Warning message: longer object length is not a multiple of shorter object length
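
The warning comes from R's recycling rule: the shorter vector is repeated until it matches the longer one. The sketch below spells that out with rep(); the name recycled is introduced here only for illustration.

```r
x <- c(1, 3, 5, 7, 9)

# Length 5 times length 1: the single value is reused for every element.
x * 2                                  # 2 6 10 14 18

# Length 6 times length 3: 3 divides 6, so no warning is raised.
c(1, 2, 3, 4, 5, 6) * c(1, 10, 100)    # 1 20 300 4 50 600

# What R effectively does for x * c(2, 4, 5):
# stretch the shorter vector to length 5, then multiply element by element.
recycled <- rep(c(2, 4, 5), length.out = length(x))
recycled                               # 2 4 5 2 4
x * recycled                           # 2 12 25 14 36
```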

4) Strings and variables

x <- c('A', 'B', 'C')

y <- c("a", "b", "c")

z <- c(x, y)

>> [1] "A" "B" "C" "a" "b" "c"

5) A vector holds values of a single type

a <- c(1, 2, "3")

>> [1] "1" "2" "3"
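
When types are mixed inside c(), R converts every element to the most general type present (logical, then integer, then numeric, then character). A short sketch of that ordering, and of converting back:

```r
class(c(TRUE, 2L))     # "integer": TRUE becomes 1L
class(c(2L, 3.5))      # "numeric"
class(c(3.5, "a"))     # "character"

# The vector from the notes, converted back to numbers.
a <- c(1, 2, "3")
as.numeric(a)          # 1 2 3
```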

2. Vector indexing and comparison operators

1) Built-in variables

letters

>> [1] "a" "b" … "z"

LETTERS

>> [1] "A" "B" … "Z"

month.name

>> [1] "January" "February" … "December"

month.abb

>> [1] "Jan" "Feb" … "Dec"

2) Vectors and indexing

month.abb[1]

>> [1] "Jan"

month.abb[1:3]

>> [1] "Jan" "Feb" "Mar"

month.abb[c(1, 3, 5)]

>> [1] "Jan" "Mar" "May"

month.abb[c(2, 1, 1, 3)]

>> [1] "Feb" "Jan" "Jan" "Mar"

month.abb[c(-1, -3, -5, -12)]

>> [1] "Feb" "Apr" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov"

month.abb[-c(1, 3, 5, 12)]

>> same as above

month.abb[-c(1:5)]

>> [1] "Jun" … "Dec"

month.abb[-1:5]

>> Error (only 0's may be mixed with negative subscripts)

month.abb[1:3][c(TRUE, FALSE, TRUE)]

>> [1] "Jan" "Mar"

month.abb[c(TRUE, FALSE, TRUE)]

>> [1] "Jan" "Mar" "Apr" "Jun" "Jul" "Sep" "Oct" "Dec"

month.abb[1:3][c(1, 0, 1)]

>> [1] "Jan" "Jan"

That is, 0 and 1 here act as numeric indices, not logical values: 0 selects nothing and each 1 selects the first element.
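
If a 0/1 vector is meant as a mask, it has to be turned into logical values first. A small sketch of two ways to do that; the names first3 and idx are introduced here for illustration.

```r
first3 <- month.abb[1:3]
idx <- c(1, 0, 1)

first3[idx]               # "Jan" "Jan": numeric positions, 0 is dropped
first3[as.logical(idx)]   # "Jan" "Mar": 1 -> TRUE, 0 -> FALSE
first3[idx == 1]          # "Jan" "Mar": same mask built by comparison
```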

3) Comparison and logical operators

& and

| or

>, <, >=, <= greater than / less than (or equal to)

!=, == not equal, equal

month.abb == 'Feb' | month.abb == 'Jan'

>> [1] TRUE TRUE FALSE FALSE FALSE …

month.abb == 'Feb' | 'Jan'

>> Error (operations are possible only for numeric, logical or complex types)

month.abb != 'Feb' | month.abb != 'Jan'

month.abb != 'Feb' & month.abb != 'Jan'

4) Comparison/logical operators and indexing

month.abb[month.abb == 'Feb']

>> [1] "Feb"

month.abb[month.abb == 'Feb' | month.abb == 'Jan']

>> [1] "Jan" "Feb"

month.abb['Jan']

>> [1] NA

month.abb[c('Jan', 'Mar')]

>> [1] NA NA

month.abb[month.abb[1:2]]

>> [1] NA NA
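
Indexing with a string looks up element names, and month.abb has none, hence the NA results. The sketch below shows value-based lookups that do work, plus a named vector; month.num is a hypothetical helper, not part of base R.

```r
# Look values up by content rather than by name.
which(month.abb == "Jan")                    # 1
match(c("Jan", "Mar"), month.abb)            # 1 3
month.abb[month.abb %in% c("Jan", "Mar")]    # "Jan" "Mar"

# String indexing works once the vector has names.
# month.num is a hypothetical named vector built for this example.
month.num <- setNames(1:12, month.abb)
month.num[c("Jan", "Mar")]                   # Jan = 1, Mar = 3
```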

3. Vectors and functions 1

a <- 1:5

length(a)

>> [1] 5

sum(a)

>> [1] 15

mean(a)

>> [1] 3

1) The sample function

data <- 1:3

sample(data, size = 5, replace = T)

>> [1] 2 3 1 1 3 (random sampling)

sample(data, 5, T)

>> [1] 2 2 3 3 3

sample(data, 5, T, prob = c(0.2, 0.2, 0.8))

>> [1] 3 3 2 3 3
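
Two details are worth a quick check: sample() gives different results on each run unless a seed is set, and the prob weights need not sum to 1 because they are treated as relative weights. A sketch, assuming an arbitrary seed of 1 and a made-up sample size of 10000:

```r
data <- 1:3
w <- c(0.2, 0.2, 0.8)       # same weights as above; they sum to 1.2

# Relative weights as probabilities.
w / sum(w)                  # 0.1666667 0.1666667 0.6666667

# set.seed() makes the draw reproducible (the seed value is arbitrary).
set.seed(1)
draws <- sample(data, size = 10000, replace = TRUE, prob = w)

# Observed proportions should be close to w / sum(w).
table(draws) / length(draws)

# Same seed, same draw.
set.seed(1)
draws_again <- sample(data, size = 10000, replace = TRUE, prob = w)
identical(draws, draws_again)   # TRUE
```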

2) The str function

x <- sample(10)

>> [1] 7 3 6 2 9 10 5 4 1 8

y <- sample(letters, 10, replace = F)

>> [1] "l" "n" "v" … "g"

str(x)

>> int [1:10] 7 3 6 2 9 10 5 4 1 8

str(y)

>> chr [1:10] "l" "n" "v" … "g"

3) The rep function

rep(c(1, 2, 3), 4)

>> [1] 1 2 3 1 2 3 1 2 3 1 2 3

rep(sample(3), 4)

>> [1] 1 2 3 1 2 3 1 2 3 1 2 3

rep(sample(3), 4)

>> [1] 2 1 3 2 1 3 2 1 3 2 1 3

rep(c(1,2,3), times = 4)

>> [1] 1 2 3 1 2 3 1 2 3 1 2 3

rep(c(1,2,3), each = 4)

>> [1] 1 1 1 1 2 2 2 2 3 3 3 3

rep(1:3, 1:3)

>> [1] 1 2 2 3 3 3

rep(1:3, 1:2)

>> Error

rep(1:3, 3:1)

>> [1] 1 1 1 2 2 3

rep(1:3, c(2,4,6))

>> [1] 1 1 2 2 2 2 3 3 3 3 3 3

rep(c(1,2,3), times = 1:3)

>> [1] 1 2 2 3 3 3

rep(c(1,2,3), each = 1:3)

>> [1] 1 2 3

>> Warning message: first element used of 'each' argument

4) The seq function

seq(1, 10)

>> [1] 1 2 3 4 5 6 7 8 9 10

seq(from = 1, to = 10)

>> same as above

seq(1, 10, 1)

>> same as above

seq(from = 1, to = 10, by = 1)

>> same as above

seq(1, 10, 2)

>> [1] 1 3 5 7 9

seq(by = 2, to = 10, from = 3)

>> [1] 3 5 7 9

seq(10, 2, 3)

>> Error

seq(10, -10, -2)

>> [1] 10 8 6 4 2 0 -2 -4 -6 -8 -10

seq(1, 8, length = 5)

>> [1] 1.00 2.75 4.50 6.25 8.00

seq(1, 8, length.out = 5)

>> same as above

seq(1, by = 3, length = 5)

>> [1] 1 4 7 10 13

seq(1, by = 3, length.out = 5)

>> same as above

- Exercise

letters[rep(1:length(letters), times = 1:length(letters))]

>> [1] "a" "b" "b" "c" "c" "c" …

letters[seq(1, length(letters), 2)]

>> [1] "a" "c" "e" …
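
Two side notes on the exercise, offered as a sketch rather than as the intended solution: 1:length(v) misbehaves when v is empty, where seq_along(v) is safer, and every other letter can also be picked with a recycled logical index. The name empty is introduced here for illustration.

```r
empty <- character(0)

1:length(empty)      # 1 0  (counts down from 1 to 0)
seq_along(empty)     # integer(0)

# Every other letter, via a recycled logical mask instead of seq().
odd_letters <- letters[c(TRUE, FALSE)]
identical(odd_letters, letters[seq(1, length(letters), 2)])   # TRUE
```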

4. Vectors and functions 2

1) Converting data types

x <- 1:5

as.numeric(x)

>> [1] 1 2 3 4 5

*** This does not change x itself to numeric; assign the result back to keep it

class(as.numeric(x))

>> [1] "numeric"

str(as.numeric(x))

>> num [1:5] 1 2 3 4 5

as.character(x)

>> [1] "1" "2" "3" "4" "5"

class(as.character(x))

>> [1] "character"

str(as.character(x))

>> chr [1:5] "1" "2" "3" "4" "5"

y <- seq(1.5, 5, 1); y

>> [1] 1.5 2.5 3.5 4.5

as.integer(y)

>> [1] 1 2 3 4

z <- letters[1:5]; z

>> [1] "a" "b" "c" "d" "e"

as.numeric(z)

>> [1] NA NA NA NA NA (Warning message)
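
The as.integer() result above comes from truncation, not rounding. A brief sketch comparing the two, with a negative value added to make the difference visible; the vector v is made up for this example.

```r
v <- c(1.5, 2.5, -1.5)

as.integer(v)    # 1 2 -1 : the fractional part is dropped (toward zero)
round(v)         # 2 2 -2 : halves go to the nearest even number

# Conversion only sticks when the result is assigned.
x <- 1:5
class(x)         # "integer"
x <- as.numeric(x)
class(x)         # "numeric"
```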

2) The names function

x <- 1:3

names(x) <- c("one", "two", "three"); x

>> one two three

1 2 3

class(x)

>> [1] "integer"

str(x)

>> Named int [1:3] 1 2 3

- attr(*, "names")= chr [1:3] "one" "two" "three"

names(x)

>> [1] "one" "two" "three"

unname(x)

>> [1] 1 2 3

*** This does not remove the names from x itself

x[1]

>> one

1

x[1:2]

>> one two

1 2

x[c('one', 'three')]

>> one three

1 3

x['one':'two']

>> Error

3) print vs. cat

print(x)

>> one two three

1 2 3

print(names(x))

>> [1] "one" "two" "three"

print(unname(x))

>> [1] 1 2 3

cat(x, '\n')

>> 1 2 3

cat(names(x), '\n')

>> one two three

cat(unname(x), '\n')

>> 1 2 3

cat(as.vector(x), '\n')

>> 1 2 3

*** Without '\n', the output does not end with a line break

a <- 1:3

b <- print(a)

>> [1] 1 2 3

c <- cat(a)

>> 1 2 3

d <- str(a)

>> int [1:3] 1 2 3

b

>> [1] 1 2 3

c

>> NULL

*** Variables assigned from str or cat hold NULL
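
The difference is in the return value: print() hands back its argument invisibly, while cat() and str() only write to the console. If the text itself is needed, capture.output() is one way to collect it. A sketch, with c_out and txt as names introduced here:

```r
a <- 1:3

b <- print(a)             # prints, and returns a invisibly
c_out <- cat(a, "\n")     # prints, and returns NULL

identical(b, a)           # TRUE
is.null(c_out)            # TRUE

# Keep what str() prints as a character string.
txt <- capture.output(str(a))
txt                       # contains "int [1:3] 1 2 3"
```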

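4) The round function

x <- seq(3.4, 3.49, 0.01)

>> [1] 3.40 3.41 … 3.49

round(x, 1)

>> [1] 3.4 3.4 … 3.5 3.5

*** Values from 3.46 onward round to 3.5 (IEEE standard)

round(seq(1.1, 1.19, 0.01), 1)

>> [1] 1.1 1.1 … 1.2 1.2

*** Because of binary floating-point error, rounding up sometimes starts at the value ending in 5 and sometimes at the one ending in 6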
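
The stored value behind a decimal like 3.45 can be inspected with sprintf(). The exact digits, and whether a value generated by seq() equals the typed literal, may differ by platform and R version, so the sketch below leaves those outputs unstated and only relies on all.equal(), which compares with a tolerance.

```r
x <- seq(3.4, 3.49, 0.01)

# More digits than print() shows by default; the tail reveals the binary error.
sprintf("%.20f", 3.45)
sprintf("%.20f", x[6])

# Exact comparison of floating-point values is fragile...
x[6] == 3.45

# ...so compare with a tolerance instead.
isTRUE(all.equal(x[6], 3.45))   # TRUE
```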

5) The which function

x <- 10:1

x == 4

>> [1] FALSE … TRUE FALSE FALSE FALSE

which(x == 4)

>> [1] 7

which(x > 3 & x < 6)

>> [1] 6 7

x[x > 3 & x < 6]

>> [1] 5 4

x[which(x > 3 & x < 6)]

>> [1] 5 4
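
The two forms above give the same result here, but they differ once the vector contains NA: a logical index keeps an NA for every NA in the condition, while which() drops it. A sketch with a made-up vector v:

```r
v <- c(10, NA, 4, 5)

cond <- v > 3 & v < 6
cond                # FALSE NA TRUE TRUE

v[cond]             # NA 4 5 : the NA in the condition yields an NA element
v[which(cond)]      # 4 5    : which() returns only positions that are TRUE
```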

6) The length function

length(letters)

>> [1] 26

length(which(letters == "a" | letters == "b"))

>> [1] 2

length(letters == 'a' | letters == 'b')

>> [1] 26

length(letters != 'a' & letters != 'b')

>> [1] 26

7) The sum function

x <- 10:1

sum(x)

>> [1] 55

sum(x == 4)

>> [1] 1

sum(x > 8 | x < 3)

>> [1] 4 *** counts the TRUE values

8) The table function

table(x)

>> x

1 2 3 4 5 6 7 8 9 10

1 1 1 1 1 1 1 1 1 1

class(table(x))

>> [1] "table"

table(x == 4)

>> FALSE TRUE

9 1

table(x > 8 | x < 3)

>> FALSE TRUE

6 4

9) Editing values

x <- 10:1

x[which(x > 8)] <- NA

>> [1] NA NA 8 7 6 5 4 3 2 1

x[x < 3] <- NA

>> [1] NA NA 8 7 6 5 4 3 NA NA

10) Value matching

x <- c("a", "b", "c", "d")

y <- c("g", "x", "d", "e", "f", "a", "c")

match(x, y)

>> [1] 6 NA 7 3

x %in% y

>> [1] TRUE FALSE TRUE TRUE

x <- c("a", "b", "c", "d")

y <- c("g", "a", "d", "e", "c", "a", "c")

match(x, y)

>> [1] 2 NA 5 3

x %in% y

>> [1] TRUE FALSE TRUE TRUE

which(y %in% x)

>> [1] 2 3 5 6 7
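
match() reports the position of the first match only, and %in% is the same lookup reduced to TRUE/FALSE. A sketch using the second pair of vectors from above; pos is a name introduced here.

```r
x <- c("a", "b", "c", "d")
y <- c("g", "a", "d", "e", "c", "a", "c")

pos <- match(x, y)
pos                  # 2 NA 5 3 : "a" also sits at position 6, but only the first hit is reported

# The positions index back into y.
y[pos]               # "a" NA "c" "d"

# %in% answers "was there a match at all?"
identical(x %in% y, !is.na(pos))   # TRUE
```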

11) Set functions

unique(x)

>> [1] "a" "b" "c" "d"

unique(y)

>> [1] "g" "a" "d" "e" "c"

union(x, y)

>> [1] "a" "b" "c" "d" "g" "e"

union(y, x)

>> [1] "g" "a" "d" "e" "c" "b"

intersect(x, y)

>> [1] "a" "c" "d"

intersect(y, x)

>> [1] "a" "d" "c"

setdiff(x, y)

>> [1] "b"

setdiff(y, x)

>> [1] "g" "e"

x <- 1:10

any(x > 8)

>> [1] TRUE

any(x > 10)

>> [1] FALSE

all(x > 8)

>> [1] FALSE

all(x > 0)

>> [1] TRUE

12) Sorting vectors

x <- c("a", "b", "c", "d")

y <- c("g", "a", "d", "e", "c", "a", "c")

sort(x)

>> [1] "a" "b" "c" "d"

sort(x, decreasing = T)

>> [1] "d" "c" "b" "a"

sort(y)

>> [1] "a" "a" "c" "c" "d" "e" "g"

order(x)

>> [1] 1 2 3 4

order(x, decreasing = T)

>> [1] 4 3 2 1

order(y)

>> [1] 2 6 5 7 3 4 1

order(y, decreasing = T)

>> [1] 1 4 3 5 7 2 6
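
sort() returns the sorted values, while order() returns the positions that would sort them. That makes order() the tool for rearranging a second vector in step with the first. A sketch; freq is a made-up vector of counts for illustration.

```r
y <- c("g", "a", "d", "e", "c", "a", "c")

# Indexing with order() reproduces sort().
identical(y[order(y)], sort(y))   # TRUE

# Hypothetical counts, one per element of y.
freq <- c(3, 10, 7, 1, 5, 2, 8)

# Reorder freq so it lines up with sort(y).
freq[order(y)]                    # 10 2 5 8 7 1 3
```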

5. Loading a text file

1) Loading from the clipboard

Open the text file in Notepad or a similar editor, press Ctrl + A (select all) and Ctrl + C, then run

TEXT <- scan(file = 'clipboard', what = 'char', quote = NULL)

2) Loading by file name

Set the working directory first from the File > Change dir menu

TEXT <- scan(file = '03_WhatIsR.txt', what = 'char', quote = NULL)

3) Choosing the file from a file-open dialog

TEXT <- scan(file = file.choose(), what = 'char', quote = NULL)

>> Read 486 items

6. Printing the loaded data

1) Viewing the first/last elements of a vector

head(TEXT); tail(TEXT)

>> prints 6 elements each

head(TEXT, 10)

tail(TEXT, 10)

2) Searching and extracting by condition

TEXT[TEXT == 'a']

>> [1] "a" "a" "a" …

length(TEXT[TEXT == "the"])

>> [1] 14

3) Editing values

TEXT[TEXT == 'an'] <- 'a'

TEXT[TEXT == 'an']

>> character(0)

4) Saving the vector to a file

cat(TEXT, file = "vector.txt", sep = '\n')
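
A quick way to confirm the save worked is to read the file back with the same scan() call and compare. The sketch below uses a short made-up vector in place of the real TEXT and writes to a temporary file, so it does not touch vector.txt.

```r
# Stand-in for the loaded text; the real TEXT comes from scan() above.
TEXT <- c("R", "is", "a", "language")

# Temporary path, so the example leaves no file behind.
path <- tempfile(fileext = ".txt")

# One element per line, as in the notes.
cat(TEXT, file = path, sep = "\n")

# Read it back the same way it was loaded originally.
again <- scan(file = path, what = "char", quote = NULL)

identical(TEXT, again)   # TRUE

unlink(path)             # clean up
```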
