" >> " 뒤에 오는 것은 앞 줄에서 실행한 입력값에 대한 출력 결과를 의미한다.
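02 Basic R Usage
1. Basic Operators
4 / 2 # division
5 %/% 2 # integer quotient
5 %% 2 # remainder
(3 + 2)^3 # exponentiation
# Variable naming (rules: a combination of letters, digits, periods, and underscores; must start with a letter; case-sensitive)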
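A minimal sketch of the naming rules above. The variable names are hypothetical, chosen only for illustration, and make.names() is a base R helper that shows how R repairs a name that breaks the rules.

```r
# Hypothetical names, chosen only to illustrate the rules.
word_count <- 10      # letters plus an underscore
word.count2 <- 20     # letters, a period, and a digit
Word_count <- 30      # differs from word_count: names are case-sensitive

word_count == Word_count       # FALSE
make.names("2nd value")        # "X2nd.value": an invalid name repaired by base R
```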
2. Data Types and the class Function
class(TRUE)
>> [1] "logical"
class(T)
>> [1] "logical"
class(12L)
>> [1] "integer"
class(3 + 2i)
>> [1] "complex"
class(12.3)
>> [1] "numeric"
as.numeric(12L)
>> [1] 12
class('a')
>> [1] "character"
class("good")
>> [1] "character"
class('2 + 4')
>> [1] "character"
3. Data Structure
a <- c('red', 'green', 'yellow')
class(a)
>> [1] "character"
str(a)
>> chr [1:3] "red" "green" "yellow"
is.vector(a)
>> [1] TRUE
b <- c(12, 13.5, 0)
class(b)
>> [1] "numeric"
str(b)
>> num [1:3] 12 13.5 0
f <- factor(c('green', 'green', 'yellow', 'red', 'red', 'red', 'green'))
class(f)
>> [1] "factor"
str(f)
>> Factor w/ 3 levels "green","red","yellow": 1 1 3 2 2 2 1
is.factor(f)
>> [1] TRUE
g <- data.frame(gender = c('Male', 'Male', 'Female'), height = c(152, 171, 165))
class(g)
>> [1] "data.frame"
str(g)
>> 'data.frame': 3 obs. of 2 variables …
is.data.frame(g)
>> [1] TRUE
dim(g)
>> [1] 3 2
4. Removing Variables
rm(a)
>> removes variable a
rm(list = ls())
>> removes all variables
03 Vectors
1. Creating Vectors and Vector Operations
1) Types
is.vector("apple")
>> [1] TRUE
str("apple")
>> chr "apple"
str(1.25)
>> num 1.25
str(3L)
>> int 3
str(TRUE)
>> logi TRUE
str(2 + 3i)
>> cplx 2+3i
2) Creating Vectors
1
>> [1] 1
c(1)
>> [1] 1
c(1, 2, 3)
>> [1] 1 2 3
1:5
>> [1] 1 2 3 4 5
class(a); class(b)
>> [1] "numeric"
[1] "character"
* A vector can contain two or more elements
c <- 5.5 : 20.4
>> [1] 5.5 6.5 … 19.5
d <- 5.5 : 20.6
>> [1] 5.5 6.5 … 20.5
3) Vector Operations
x <- c(1, 3, 5, 7, 9)
y <- c(2, 4, 6, 8, 10)
x + y
>> [1] 3 7 11 15 19
x * c(2, 4, 5)
>> [1] 2 12 25 14 36
>> Warning message: longer object length is not a multiple of shorter object length
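The warning comes from recycling: the shorter vector is repeated until it matches the longer one. A small sketch, using rep_len() only to spell out the vector R effectively multiplies by:

```r
x <- c(1, 3, 5, 7, 9)

# A length-1 vector recycles cleanly across all five elements.
x * 2                  # 2 6 10 14 18

# Length 3 does not divide length 5, so R recycles and warns.
# rep_len() makes the recycled vector explicit.
recycled <- rep_len(c(2, 4, 5), length(x))
recycled               # 2 4 5 2 4
x * recycled           # 2 12 25 14 36, this time without a warning
```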
4) Strings and Variables
x <- c('A', 'B', 'C')
y <- c("a", "b", "c")
z <- c(x, y)
z
>> [1] "A" "B" "C" "a" "b" "c"
5) A vector holds values of a single type
a <- c(1, 2, "3")
a
>> [1] "1" "2" "3"
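When types are mixed, c() converts every element to the most general type present. A short sketch of that promotion order (the variable name mixed is just for illustration):

```r
class(c(TRUE, 2L))        # "integer":   logical is promoted to integer
class(c(2L, 3.5))         # "numeric":   integer is promoted to double
class(c(3.5, "a"))        # "character": everything becomes a string

mixed <- c(TRUE, 2L, 3.5, "a")
mixed                     # "TRUE" "2" "3.5" "a"
```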
2. Vector Indexing and Comparison Operators
1) Built-in Variables
letters
>> [1] "a" "b" … "z"
LETTERS
>> [1] "A" "B" … "Z"
month.name
>> [1] "January" "February" … "December"
month.abb
>> [1] "Jan" "Feb" … "Dec"
2) Vector and Indexing
month.abb[1]
>> [1] "Jan"
month.abb[1:3]
>> [1] "Jan" "Feb" "Mar"
month.abb[c(1, 3, 5)]
>> [1] "Jan" "Mar" "May"
month.abb[c(2, 1, 1, 3)]
>> [1] "Feb" "Jan" "Jan" "Mar"
month.abb[c(-1, -3, -5, -12)]
>> [1] "Feb" "Apr" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov"
month.abb[-c(1, 3, 5, 12)]
>> same as above
month.abb[-c(1:5)]
>> [1] "Jun" … "Dec"
month.abb[-1:5]
>> Error (only 0's may be mixed with negative subscripts)
month.abb[1:3][c(TRUE, FALSE, TRUE)]
>> [1] "Jan" "Mar"
month.abb[c(TRUE, FALSE, TRUE)]
>> [1] "Jan" "Mar" "Apr" "Jun" "Jul" "Sep" "Oct" "Dec"
month.abb[1:3][c(1, 0, 1)]
>> [1] "Jan" "Jan"
That is, 0 and 1 act here as numeric indices, not as logical values.
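If a 0/1 vector is meant as a mask, it has to be converted to logical first. A minimal sketch (first_three is a hypothetical helper name):

```r
first_three <- month.abb[1:3]

first_three[c(1, 0, 1)]               # "Jan" "Jan": numeric indices, 0 selects nothing
first_three[as.logical(c(1, 0, 1))]   # "Jan" "Mar": logical mask
first_three[c(1, 0, 1) == 1]          # "Jan" "Mar": the same mask built by comparison
```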
3) Comparison/Logical Operators
& and
| or
>, <, >=, <= greater/less
!=, == not equal, equal
month.abb == 'Feb' | month.abb == 'Jan'
>> [1] TRUE TRUE FALSE FALSE FALSE …
month.abb == 'Feb' | 'Jan'
>> Error (operations are possible only for numeric, logical or complex types)
month.abb != 'Feb' | month.abb != 'Jan'
month.abb != 'Feb' & month.abb != 'Jan'
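The last two expressions look alike but behave very differently. A small sketch of what each one returns (the names either and neither are hypothetical):

```r
either  <- month.abb != 'Feb' | month.abb != 'Jan'
neither <- month.abb != 'Feb' & month.abb != 'Jan'

all(either)            # TRUE: every month differs from at least one of the two
sum(neither)           # 10: all months except Jan and Feb

# The & form is the negation of the == ... | == ... test
identical(neither, !(month.abb == 'Feb' | month.abb == 'Jan'))   # TRUE
```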
4) Indexing with Comparison/Logical Operators
month.abb[month.abb == 'Feb']
>> [1] "Feb"
month.abb[month.abb == 'Feb' | month.abb == 'Jan']
>> [1] "Jan" "Feb"
month.abb['Jan']
>> [1] NA
month.abb[c('Jan', 'Mar')]
>> [1] NA NA
month.abb[month.abb[1:2]]
>> [1] NA NA
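The NA results appear because indexing by a string looks up element names, and month.abb has none. A sketch of two ways to select by value instead, assuming a named copy (months_named is a hypothetical variable):

```r
months_named <- month.abb
names(months_named) <- month.abb            # hypothetical named copy

months_named[c('Jan', 'Mar')]               # works: the names now exist
month.abb[month.abb %in% c('Jan', 'Mar')]   # "Jan" "Mar", no names required
```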
3. Vectors and Functions 1
a <- 1:5
length(a)
>> [1] 5
sum(a)
>> [1] 15
mean(a)
>> [1] 3
1) The sample Function
data <- 1:3
sample(data, size = 5, replace = T)
>> [1] 2 3 1 1 3 (random sampling)
sample(data, 5, T)
>> [1] 2 2 3 3 3
sample(data, 5, T, prob = c(0.2, 0.2, 0.8))
>> [1] 3 3 2 3 3
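Because sample() is random, the outputs above will differ on each run. A sketch of making a draw repeatable with set.seed(); the seed value 42 is arbitrary:

```r
set.seed(42)                       # 42 is an arbitrary seed
first <- sample(1:3, size = 5, replace = TRUE)

set.seed(42)                       # same seed, same draw
second <- sample(1:3, size = 5, replace = TRUE)

identical(first, second)           # TRUE

# prob holds weights; they need not sum to 1 and are rescaled
weights <- c(0.2, 0.2, 0.8)
weights / sum(weights)             # the probabilities actually used
```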
2) The str Function
x <- sample(10)
x
>> [1] 7 3 6 2 9 10 5 4 1 8
y <- sample(letters, 10, replace = F)
y
>> [1] "l" "n" "v" … "g"
str(x)
>> int [1:10] 7 3 6 2 9 10 5 4 1 8
str(y)
>> chr [1:10] "l" "n" "v" … "g"
3) The rep Function
rep(c(1, 2, 3), 4)
>> [1] 1 2 3 1 2 3 1 2 3 1 2 3
rep(sample(3), 4)
>> [1] 1 2 3 1 2 3 1 2 3 1 2 3
rep(sample(3), 4)
>> [1] 2 1 3 2 1 3 2 1 3 2 1 3
rep(c(1,2,3), times = 4)
>> [1] 1 2 3 1 2 3 1 2 3 1 2 3
rep(c(1,2,3), each = 4)
>> [1] 1 1 1 1 2 2 2 2 3 3 3 3
rep(1:3, 1:3)
>> [1] 1 2 2 3 3 3
rep(1:3, 1:2)
>> Error
rep(1:3, 3:1)
>> [1] 1 1 1 2 2 3
rep(1:3, c(2,4,6))
>> [1] 1 1 2 2 2 2 3 3 3 3 3 3
rep(c(1,2,3), times = 1:3)
>> [1] 1 2 2 3 3 3
rep(c(1,2,3), each = 1:3)
>> [1] 1 2 3
Warning message: first element used of 'each' argument
4) The seq Function
seq(1, 10)
>> [1] 1 2 3 4 5 6 7 8 9 10
seq(from = 1, to = 10)
>> same as above
seq(1, 10, 1)
>> same as above
seq(from = 1, to = 10, by = 1)
>> same as above
seq(1, 10, 2)
>> [1] 1 3 5 7 9
seq(by = 2, to = 10, from = 3)
>> [1] 3 5 7 9
seq(10, 2, 3)
>> Error
seq(10, -10, -2)
>> [1] 10 8 6 4 2 0 -2 -4 -6 -8 -10
seq(1, 8, length = 5)
>> [1] 1.00 2.75 4.50 6.25 8.00
seq(1, 8, length.out = 5)
>> same as above
seq(1, by = 3, length = 5)
>> [1] 1 4 7 10 13
seq(1, by = 3, length.out = 5)
>> same as above
- Exercises
letters[rep(1:length(letters), times = 1:length(letters))]
>> [1] "a" "b" "b" "c" "c" "c" …
letters[seq(1, length(letters), 2)]
>> [1] "a" "c" "e" …
4. Vectors and Functions 2
1) Data Type Conversion
x <- 1:5
as.numeric(x)
>> [1] 1 2 3 4 5
*** However, x itself is not converted to num
class(as.numeric(x))
>> [1] "numeric"
str(as.numeric(x))
>> num [1:5] 1 2 3 4 5
as.character(x)
>> [1] "1" "2" "3" "4" "5"
class(as.character(x))
>> [1] "character"
str(as.character(x))
>> chr [1:5] "1" "2" "3" "4" "5"
y <- seq(1.5, 5, 1); y
>> [1] 1.5 2.5 3.5 4.5
as.integer(y)
>> [1] 1 2 3 4
z <- letters[1:5]; z
>> [1] "a" "b" "c" "d" "e"
as.numeric(z)
>> [1] NA NA NA NA NA (Warning message)
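The as.* functions return a converted copy, so the result has to be assigned back to keep it. A minimal sketch; suppressWarnings() is used here only to silence the coercion warning:

```r
x <- 1:5
as.numeric(x)          # returns a converted copy
class(x)               # "integer": x is unchanged

x <- as.numeric(x)     # reassign to keep the conversion
class(x)               # "numeric"

z <- letters[1:5]
converted <- suppressWarnings(as.numeric(z))   # silence the coercion warning
all(is.na(converted))                          # TRUE: no letter parses as a number
```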
2) The names Function
x <- 1:3
names(x) <- c("one", "two", "three")
x
>> one two three
1 2 3
class(x)
>> [1] "integer"
str(x)
>> Named int [1:3] 1 2 3
- attr(*, "names")= chr [1:3] "one" "two" "three"
names(x)
>> [1] "one" "two" "three"
unname(x)
>> [1] 1 2 3
*** However, the names are not removed from x itself
x[1]
>> one
1
x[1:2]
>> one two
1 2
x[c('one', 'three')]
>> one three
1 3
x['one' : 'two']
>> Error
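Since unname() returns a copy, removing the names for good requires reassignment. A short sketch of that step:

```r
x <- 1:3
names(x) <- c("one", "two", "three")

unname(x)              # prints 1 2 3, but x still has names
names(x)               # "one" "two" "three"

x <- unname(x)         # reassign; names(x) <- NULL has the same effect
is.null(names(x))      # TRUE
```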
3) print vs. cat
print(x)
>> one two three
1 2 3
print(names(x))
>> [1] "one" "two" "three"
print(unname(x))
>> [1] 1 2 3
cat(x, '\n')
>> 1 2 3
cat(names(x), '\n')
>> one two three
cat(unname(x), '\n')
>> 1 2 3
cat(as.vector(x), '\n')
>> 1 2 3
*** Without '\n', the output does not move on to the next line
a <- 1:3
b <- print(a)
>> [1] 1 2 3
c <- cat(a)
>> 1 2 3
d <- str(a)
>> int [1:3] 1 2 3
b
>> [1] 1 2 3
c
>> NULL
d
>> NULL
*** Variables assigned from str and cat print NULL, because both functions return NULL
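print() hands back its argument, while cat() and str() only write to the console. If the printed text itself is needed, capture.output() is one way to keep it. A sketch (the names c_out and captured are hypothetical):

```r
a <- 1:3

b <- print(a)                        # print() returns its argument invisibly
identical(a, b)                      # TRUE

c_out <- cat(a, "\n")                # cat() returns NULL
is.null(c_out)                       # TRUE

captured <- capture.output(str(a))   # keep str()'s text as a character vector
captured                             # one string containing: int [1:3] 1 2 3
```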
4) The round Function
x <- seq(3.4, 3.49, 0.01)
x
>> [1] 3.40 3.41 … 3.49
round(x, 1)
>> [1] 3.4 3.4 … 3.5 3.5
*** Rounding up to 3.5 starts from 3.46 (IEEE standard)
round(seq(1.1, 1.19, 0.01), 1)
>> [1] 1.1 1.1 … 1.2 1.2
*** Because of binary floating-point error, rounding up may start at the value ending in 5 or at the one ending in 6
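The underlying cause is that most decimal fractions have no exact binary representation. A minimal sketch of how to see this; the sprintf() call simply prints more digits than R shows by default, and its exact output is not reproduced here:

```r
0.1 + 0.2 == 0.3                     # FALSE: the stored binary values differ slightly
isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE: comparison within a tolerance

# Print extra digits to inspect what is actually stored
x <- seq(3.4, 3.49, 0.01)
sprintf("%.20f", x[6:7])
```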
5) The which Function
x <- 10:1
x == 4
>> [1] FALSE …TRUE FALSE FALSE FALSE
which(x == 4)
>> [1] 7
which(x > 3 & x < 6)
>> [1] 6 7
x[x > 3 & x < 6]
>> [1] 5 4
x[which(x > 3 & x < 6)]
>> [1] 5 4
6) The length Function
length(letters)
>> [1] 26
length(which(letters == "a" | letters == "b"))
>> [1] 2
length(letters == 'a' | letters == 'b')
>> [1] 26
length(letters != 'a' & letters != 'b')
>> [1] 26
7) The sum Function
x <- 10:1
sum(x)
>> [1] 55
sum(x == 4)
>> [1] 1
sum(x > 8 | x < 3)
>> [1] 4 *** prints the count of TRUE values
8) The table Function
table(x)
>> x
1 2 3 4 5 6 7 8 9 10
1 1 1 1 1 1 1 1 1 1
class(table(x))
>> [1] "table"
table(x == 4)
>> FALSE TRUE
9 1
table(x > 8 | x < 3)
>> FALSE TRUE
6 4
9) Editing Values
x <- 10:1
x[which(x > 8)] <- NA
x
>> [1] NA NA 8 7 6 5 4 3 2 1
x[x < 3] <- NA
x
>> [1] NA NA 8 7 6 5 4 3 NA NA
10) Value Matching
x <- c("a", "b", "c", "d")
y <- c("g", "x", "d", "e", "f", "a", "c")
match(x, y)
>> [1] 6 NA 7 3
x %in% y
>> [1] TRUE FALSE TRUE TRUE
x <- c("a", "b", "c", "d")
y <- c("g", "a", "d", "e", "c", "a", "c")
match(x, y)
>> [1] 2 NA 5 3
x %in% y
>> [1] TRUE FALSE TRUE TRUE
which(y %in% x)
>> [1] 2 3 5 6 7
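match() returns only the first position of each element, and %in% reports whether any match was found. A sketch of how the two relate (pos is a hypothetical variable name):

```r
x <- c("a", "b", "c", "d")
y <- c("g", "a", "d", "e", "c", "a", "c")

pos <- match(x, y)         # 2 NA 5 3: first position of each x element in y
y[pos]                     # "a" NA "c" "d"

# %in% is TRUE exactly where match() found a position
identical(x %in% y, !is.na(pos))   # TRUE

x[!x %in% y]               # "b": elements of x that are absent from y
```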
11) Set Functions
unique(x)
>> [1] "a" "b" "c" "d"
unique(y)
>> [1] "g" "a" "d" "e" "c"
union(x, y)
>> [1] "a" "b" "c" "d" "g" "e"
union(y, x)
>> [1] "g" "a" "d" "e" "c" "b"
intersect(x, y)
>> [1] "a" "c" "d"
intersect(y, x)
>> [1] "a" "d" "c"
setdiff(x, y)
>> [1] "b"
setdiff(y, x)
>> [1] "g" "e"
x <- 1:10
any(x > 8)
>> [1] TRUE
any(x > 10)
>> [1] FALSE
all(x > 8)
>> [1] FALSE
all(x > 0)
>> [1] TRUE
12) Sorting Vectors
x <- c("a", "b", "c", "d")
y <- c("g", "a", "d", "e", "c", "a", "c")
sort(x)
>> [1] "a" "b" "c" "d"
sort(x, decreasing = T)
>> [1] "d" "c" "b" "a"
sort(y)
>> [1] "a" "a" "c" "c" "d" "e" "g"
order(x)
>> [1] 1 2 3 4
order(x, decreasing = T)
>> [1] 4 3 2 1
order(y)
>> [1] 2 6 5 7 3 4 1
order(y, decreasing = T)
>> [1] 1 4 3 5 7 2 6
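sort() returns the sorted values, while order() returns the positions that would sort them. A sketch of why the positions are useful; counts is a hypothetical vector paired element by element with y:

```r
y <- c("g", "a", "d", "e", "c", "a", "c")

idx <- order(y)
identical(y[idx], sort(y))         # TRUE: indexing by order() sorts the vector

# The same positions can reorder a second, parallel vector
counts <- c(7, 1, 4, 5, 3, 2, 6)   # hypothetical values paired with y
counts[idx]                        # 1 2 3 6 4 5 7
```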
5. Loading a Text File
1) Loading from the clipboard
Open the text file in Notepad or a similar editor, press Ctrl + A (select all) and Ctrl + C, then run:
TEXT <- scan(file = 'clipboard', what = 'char', quote = NULL)
2) Loading by file name
Set the working directory first from the File – Change dir menu
TEXT <- scan(file = '03_WhatIsR.txt', what = 'char', quote = NULL)
3) Choosing a file from the file open dialog
TEXT <- scan(file = file.choose(), what = 'char', quote = NULL)
>> Read 486 items
6. Displaying the Loaded Data
1) Viewing the first/last elements of a vector
head(TEXT); tail(TEXT)
>> prints 6 elements each
head(TEXT, 10)
tail(TEXT, 10)
2) Searching and extracting by condition
TEXT[TEXT == 'a']
>> [1] "a" "a" "a" …
length(TEXT[TEXT == "the"])
>> [1] 14
3) Editing values
TEXT[TEXT == 'an'] <- 'a'
TEXT[TEXT == 'an']
>> character(0)
4) Saving the vector to a file
cat(TEXT, file = "vector.txt", sep = '\n')
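A quick way to check that the saved file holds what was written is to read it back. A sketch under stated assumptions: words is a hypothetical stand-in for TEXT, the file is a temporary one rather than vector.txt, and quote = "" is used here to turn off quote handling in scan().

```r
words <- c("R", "is", "a", "language")     # hypothetical stand-in for TEXT
path <- tempfile(fileext = ".txt")          # temporary file, not vector.txt

cat(words, file = path, sep = "\n")         # one element per line
reloaded <- scan(file = path, what = "char", quote = "", quiet = TRUE)

identical(words, reloaded)                  # TRUE
```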