Data Engineer

Programming Language/Python

[Python] How to get the stock index from one week ago through today with Python

'''
pip install pandas beautifulsoup4 finance-datareader
'''
import FinanceDataReader as fdr
from datetime import datetime, timedelta

if __name__ == '__main__':
    # Date one week before now, formatted as YYYY-MM-DD
    now_datetime = datetime.now()
    before_one_week = (now_datetime - timedelta(weeks=1)).strftime('%Y-%m-%d')

    # KS11 is the KOSPI index symbol; with no end date, data runs through today
    df_ks11 = fdr.DataReader(symbol='KS11', start=before_one_week)
    print(df_ks11)
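The date arithmetic in the snippet above can be pulled into a small helper so it can be checked without any network call. This is only a sketch: the name `one_week_window` and its `now` and `fmt` parameters are hypothetical and not part of FinanceDataReader.

```python
from datetime import datetime, timedelta


def one_week_window(now=None, fmt='%Y-%m-%d'):
    """Return (start, end) date strings spanning the seven days up to `now`."""
    now = now or datetime.now()
    start = now - timedelta(weeks=1)
    return start.strftime(fmt), now.strftime(fmt)


if __name__ == '__main__':
    # A fixed date keeps the example reproducible
    start, end = one_week_window(datetime(2023, 1, 20))
    print(start, end)  # 2023-01-13 2023-01-20
```

The two strings can then be passed as the `start` and `end` arguments of `fdr.DataReader`.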

Programming Language/Python

[Python] How to list only KOSDAQ stocks with Python

'''
pip install pandas beautifulsoup4 finance-datareader
'''
import FinanceDataReader as fdr


def get_kosdaq_dataframe():
    # Fetch the full KRX listing, then keep only the KOSDAQ rows
    df = fdr.StockListing('KRX')
    df = df[df['Market'] == 'KOSDAQ']
    return df


if __name__ == '__main__':
    df_kosdaq = get_kosdaq_dataframe()
    print(df_kosdaq.head(20))

Programming Language/Python

[Python] How to list only KOSPI stocks with Python

'''
pip install pandas beautifulsoup4 finance-datareader
'''
import FinanceDataReader as fdr


def get_kospi_dataframe():
    # Fetch the full KRX listing, then keep only the KOSPI rows
    df = fdr.StockListing('KRX')
    df = df[df['Market'] == 'KOSPI']
    return df


if __name__ == '__main__':
    df_kospi = get_kospi_dataframe()
    print(df_kospi.head(20))
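The KOSDAQ and KOSPI snippets differ only in the market string, so the filter could be shared. The sketch below assumes the listing has a 'Market' column, as in the snippets above; `filter_by_market` is a hypothetical helper, and the sample rows are made up so the example runs without fetching the real listing.

```python
import pandas as pd


def filter_by_market(listing, market):
    """Return only the rows of `listing` whose 'Market' column equals `market`."""
    return listing[listing['Market'] == market].reset_index(drop=True)


if __name__ == '__main__':
    # Made-up rows standing in for the result of fdr.StockListing('KRX')
    sample = pd.DataFrame({
        'Code': ['000001', '000002', '000003'],
        'Name': ['Alpha', 'Beta', 'Gamma'],
        'Market': ['KOSPI', 'KOSDAQ', 'KOSPI'],
    })
    print(filter_by_market(sample, 'KOSDAQ'))
```

Taking the DataFrame as an argument also means the listing only has to be downloaded once when both markets are needed.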

Programming Language/Python

[Python] How to print the price of a specific stock with Python

'''
pip install pandas beautifulsoup4 finance-datareader
'''
import FinanceDataReader as fdr


def search_code(name):
    # Look up the ticker code whose 'Name' matches exactly
    df = fdr.StockListing('KRX')
    code = df[df['Name'] == name]['Code'].to_string(index=False)
    return code


if __name__ == '__main__':
    code = search_code('LG에너지솔루션')
    # Daily prices for the given date range
    df = fdr.DataReader(symbol=code, start='2023-01-10', end='2023-01-20')
    print(df)

Programming Language/Python

[Python] How to look up a stock ticker code with Python

'''
pip install pandas beautifulsoup4 finance-datareader
'''
import FinanceDataReader as fdr


def search_code(name):
    # Look up the ticker code whose 'Name' matches exactly
    df = fdr.StockListing('KRX')
    code = df[df['Name'] == name]['Code'].to_string(index=False)
    return code


if __name__ == '__main__':
    code_samsung = search_code('삼성전자')
    print(code_samsung, type(code_samsung))

    code_sk = search_code('SK하이닉스')
    print(code_sk, type(code_sk))
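`to_string(index=False)` returns formatted text, so a name that matches nothing still produces a string rather than an obvious failure. One possible alternative, sketched here with a hypothetical `find_code` helper and made-up sample rows, is to take the first matching value directly and return `None` when there is no match.

```python
import pandas as pd


def find_code(listing, name):
    """Return the 'Code' of the first row whose 'Name' equals `name`, or None."""
    matches = listing.loc[listing['Name'] == name, 'Code']
    if matches.empty:
        return None
    return str(matches.iloc[0])


if __name__ == '__main__':
    # Made-up rows standing in for the result of fdr.StockListing('KRX')
    sample = pd.DataFrame({
        'Code': ['000001', '000002'],
        'Name': ['Alpha', 'Beta'],
    })
    print(find_code(sample, 'Beta'))
    print(find_code(sample, 'Missing'))
```

The caller can then test for `None` before passing the code on to `fdr.DataReader`.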

Data Engineering/Spark

[Spark] Setting up a pyspark environment with conda

List the Anaconda virtual environments
conda env list

Create a Python virtual environment
conda create -n py38 python==3.8 -y

Check the environment list again
conda env list

Activate the environment, then check the list
conda activate py38
conda env list

Install pyspark 3.3.1
conda install -c conda-forge pyspark==3.3.1 -y

Check the installed libraries
pip list

Run a simple pyspark script
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = Spar..

Data Engineering/Spark

[Spark] ValueError: field score: This field is not nullable, but got None

The following error occurs:

Traceback (most recent call last):
  File "df_schema_null.py", line 23, in <module>
    df = spark.createDataFrame(data = data, schema = schema)
  File "/Users/pgt0409/opt/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/sql/session.py", line 894, in createDataFrame
    return self._create_dataframe(
  File "/Users/pgt0409/opt/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/sql/session.py"..

Data Engineering/Spark

[Spark] How to set schema data types when creating a pyspark dataframe

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession \
    .builder \
    .master('local') \
    .appName('my_pyspark_app') \
    .getOrCreate()

data = [
    ('kim', 100),
    ('kim', 90),
    ('lee', 80),
    ('lee', 70),
    ('park', 60)
]

schema = ['name', 'score']

df = spark.createDataFrame(data = data, schema = schema)
df.printSchema()
df.show..

Data Engineering/Spark

[Spark] How to turn a specific column of a pyspark dataframe into a list

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession \
    .builder \
    .master('local') \
    .appName('my_pyspark_app') \
    .getOrCreate()

data = [
    ('kim', 100),
    ('kim', 90),
    ('lee', 80),
    ('lee', 70),
    ('park', 60)
]

schema = StructType([
    StructField('name', StringType(), True),
    StructField('score', IntegerType(), True)..

Data Engineering/Spark

[Spark] The best and fastest way to turn a pyspark dataframe into a list

+-------------------------------------------------------------+---------+-------------+
| Code                                                        | 100,000 | 100,000,000 |
+-------------------------------------------------------------+---------+-------------+
| df.select("col_name").rdd.flatMap(lambda x: x).collect()    | 0.4     | 55.3        |
| list(df.select('col_name').toPandas()['col_name'])          | 0.4     | 17.5        |
| df.select('col_name').rdd.map(lambda row : ro..

박경태
Posts in the 'All categories' category (page 35)