Data Engineer

Operating System/Docker

[Windows11] How to install VS Code in Windows 11

Download Visual Studio Code from https://code.visualstudio.com/download. Visual Studio Code is free and available on Linux, macOS, and Windows.

Operating System/Docker

[Docker] Error - WSL 2 installation is incomplete.

Run as administrator:

dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart

Then download and install the WSL2 Linux kernel update package:

wslstorestorage.blob.core.windows.net/wslblob/wsl_update_x64.msi

Operating System/Docker

[Docker] How to install Docker Desktop in Windows 11

Go to https://www.docker.com/get-started/ and click Download for Windows.

Data Engineering/Spark

[Spark] To set up memory in a Spark session

Code
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName('3_test_sparksession') \
    .master('spark://spark-master:17077') \
    .config('spark.driver.cores', '1') \
    .config('spark.driver.memory', '1g') \
    .config('spark.executor.memory', '1g') \
    .config('spark.executor.cores', '2') \
    .config('spark.cores.max', '2') \
    .getOrCreate()

sc = spark.sparkContext
for setting in sc._conf..
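The preview above cuts off inside the configuration loop. A minimal runnable sketch of the full pattern, reusing the app name and master URL from the preview and the getAll() print loop that appears in the default-settings post below, would be:

from pyspark.sql import SparkSession

# Build a session with explicit driver/executor memory and core limits.
spark = SparkSession.builder \
    .appName('3_test_sparksession') \
    .master('spark://spark-master:17077') \
    .config('spark.driver.cores', '1') \
    .config('spark.driver.memory', '1g') \
    .config('spark.executor.memory', '1g') \
    .config('spark.executor.cores', '2') \
    .config('spark.cores.max', '2') \
    .getOrCreate()

# Print every effective setting to confirm the values took effect.
sc = spark.sparkContext
for setting in sc._conf.getAll():
    print(setting)

spark.stop()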

Data Engineering/Spark

[Spark] How to use the Global Temporary View

Code
from pyspark.sql import SparkSession
from pyspark.sql import Row

spark = SparkSession.builder \
    .appName("1_test_dataframe") \
    .master('spark://spark-master:17077') \
    .getOrCreate()

sc = spark.sparkContext
data = [Row(id = 0, name = 'a', age = 12, type = 'A', score = 90, year = 2012),
        Row(id = 1, name = 'a', age = 15, type = 'B', score = 80, year = 2013),
        Row(id = 2, name = 'b', age = 15, t..
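The sample data is truncated mid-row. A minimal sketch of the global temporary view pattern the title refers to, using a shortened version of the rows above (the view name 'people' and the queries are illustrative assumptions):

from pyspark.sql import Row, SparkSession

spark = SparkSession.builder \
    .appName("1_test_dataframe") \
    .master('spark://spark-master:17077') \
    .getOrCreate()

# Two of the sample rows from the preview; the full list is cut off.
data = [Row(id=0, name='a', age=12, type='A', score=90, year=2012),
        Row(id=1, name='a', age=15, type='B', score=80, year=2013)]
df = spark.createDataFrame(data)

# A global temp view is registered in the global_temp database and is
# shared across all sessions of the same Spark application.
df.createGlobalTempView("people")
spark.sql("SELECT name, score FROM global_temp.people").show()

# Unlike a regular temp view, it is still visible from a new session.
spark.newSession().sql("SELECT * FROM global_temp.people").show()

spark.stop()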

Data Engineering/Spark

[Spark] Basic way to use Spark DataFrames with SQL queries

Code
from pyspark.sql import SparkSession
from pyspark.sql import Row

spark = SparkSession.builder \
    .appName("1_test_dataframe") \
    .master('spark://spark-master:17077') \
    .getOrCreate()

sc = spark.sparkContext
data = [Row(id = 0, name = 'a', age = 12, type = 'A', score = 90, year = 2012),
        Row(id = 1, name = 'a', age = 15, type = 'B', score = 80, year = 2013),
        Row(id = 2, name = 'b', age = 15, t..
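For the session-scoped variant this post's title describes, a minimal sketch (the view name 'scores' and the aggregate query are illustrative assumptions) could be:

from pyspark.sql import Row, SparkSession

spark = SparkSession.builder \
    .appName("1_test_dataframe") \
    .master('spark://spark-master:17077') \
    .getOrCreate()

data = [Row(id=0, name='a', age=12, type='A', score=90, year=2012),
        Row(id=1, name='a', age=15, type='B', score=80, year=2013)]
df = spark.createDataFrame(data)

# Register the DataFrame as a temp view scoped to this session,
# then query it with ordinary SQL.
df.createOrReplaceTempView("scores")
spark.sql("SELECT type, avg(score) AS avg_score FROM scores GROUP BY type").show()

spark.stop()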

Data Engineering/Spark

[Spark] Comparing user settings in PySpark code

Code
from pyspark.conf import SparkConf
from pyspark.context import SparkContext

conf = SparkConf().setAll([('spark.app.name', '2_test_sparkconf'),
                           ('spark.master', 'spark://spark-master:17077')])
sc = SparkContext(conf = conf)
print('first')
for setting in sc._conf.getAll():
    print(setting)
sc.stop()

conf = SparkConf().setAll([('spark.app.name', '2_test_sparkconf'),
                           ('spark.master', 'spark://spa..
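The preview truncates before the second configuration, so what differs between the two runs is not visible here. A sketch of the comparison idea, with a hypothetical show_settings() helper and an assumed spark.executor.memory override for the second run:

from pyspark.conf import SparkConf
from pyspark.context import SparkContext

def show_settings(pairs, label):
    # Start a context with the given settings and print what Spark reports.
    conf = SparkConf().setAll(pairs)
    sc = SparkContext(conf=conf)
    print(label)
    for setting in sc._conf.getAll():
        print(setting)
    sc.stop()

base = [('spark.app.name', '2_test_sparkconf'),
        ('spark.master', 'spark://spark-master:17077')]

show_settings(base, 'first')                                       # defaults only
show_settings(base + [('spark.executor.memory', '1g')], 'second')  # assumed override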

Data Engineering/Spark

[Spark] To output the default settings for a Spark Session

Code
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName('3_test_sparksession') \
    .master('spark://spark-master:17077') \
    .getOrCreate()

sc = spark.sparkContext
for setting in sc._conf.getAll():
    print(setting)
sc.stop()

Result
('spark.driver.port', '39007')
('spark.master', 'spark://spark-master:17077')
('spark.sql.warehouse.dir', 'file:/home/spark/dev/spark-warehouse') ..
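Note that sc._conf is a private attribute; the public SparkContext.getConf() returns a copy of the same configuration, so an equivalent sketch without the underscore access is:

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName('3_test_sparksession') \
    .master('spark://spark-master:17077') \
    .getOrCreate()

# getConf() is the public counterpart of the private _conf attribute.
for setting in spark.sparkContext.getConf().getAll():
    print(setting)

spark.stop()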

Data Engineering/Spark

[Spark] To output Spark settings from PySpark code

Code
from pyspark.conf import SparkConf
from pyspark.context import SparkContext

conf = SparkConf().setAll([('spark.app.name', '2_test_sparksession'),
                           ('spark.master', 'spark://spark-master:17077'),
                           ('spark.driver.cores', '1'),
                           ('spark.driver.memory', '1g'),
                           ('spark.executor.memory', '1g'),
                           ('spark.executor.cores', '2'),
                           ('spark.cores.max', '2')])
sc = SparkContext(conf = conf)
for setting in sc...
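Besides iterating over getAll(), individual keys can be read back with SparkConf.get() and tested with contains(); a small sketch using a subset of the keys above:

from pyspark.conf import SparkConf
from pyspark.context import SparkContext

conf = SparkConf().setAll([('spark.app.name', '2_test_sparksession'),
                           ('spark.master', 'spark://spark-master:17077'),
                           ('spark.executor.memory', '1g')])
sc = SparkContext(conf=conf)

# Read back single settings instead of dumping everything.
print(sc._conf.get('spark.executor.memory'))   # '1g'
print(sc._conf.contains('spark.cores.max'))    # False unless set elsewhere

sc.stop()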

Data Engineering/Spark

[Spark] How to adjust Spark memory in PySpark code

Code
from pyspark.conf import SparkConf
from pyspark.context import SparkContext

conf = SparkConf().setAll([('spark.app.name', '2_test_sparksession'),
                           ('spark.master', 'spark://spark-master:17077'),
                           ('spark.driver.cores', '1'),
                           ('spark.driver.memory', '1g'),
                           ('spark.executor.memory', '1g'),
                           ('spark.executor.cores', '1'),
                           ('spark.cores.max', '2')])
sc = SparkContext(conf = conf)
sc.stop()

Result C..
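Since SparkConf.set() returns the conf object itself, the same memory adjustment can also be written as a chain of set() calls, equivalent to the setAll([...]) form above:

from pyspark.conf import SparkConf
from pyspark.context import SparkContext

# set() returns the SparkConf, so the calls can be chained; this mirrors
# the setAll([...]) list in the preview.
conf = (SparkConf()
        .set('spark.app.name', '2_test_sparksession')
        .set('spark.master', 'spark://spark-master:17077')
        .set('spark.driver.cores', '1')
        .set('spark.driver.memory', '1g')
        .set('spark.executor.memory', '1g')
        .set('spark.executor.cores', '1')
        .set('spark.cores.max', '2'))

sc = SparkContext(conf=conf)
sc.stop()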
