Data Solution 2019(13)Docker Zeppelin Notebook and Memory Configuration
On my Mac, I run into this error when I build my Docker image:
Disk Requirements:
At least 187MB more space needed on the / filesystem.
I check my disk space, and I do have free space on the Mac. The problem is probably that I have built too many Docker images there, so here are the commands to clean them up.
Remove all the containers
> docker rm $(docker ps -qa)
Remove all the images
> docker rmi $(docker image ls -qa)
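If you only want to reclaim space without wiping everything, a less aggressive option (assuming a reasonably recent Docker CLI) is the built-in prune commands:
> docker container prune
> docker image prune -a
> docker system df
docker container prune removes stopped containers, docker image prune -a removes images not referenced by any container, and docker system df shows how much space Docker is currently using.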
Memory and Cores Settings
Partitions: split the large data set into smaller pieces
Task: runs inside a single executor; tasks can run in parallel
Executor: a JVM process on a worker node; one node can run multiple executors
Driver: the SparkContext connects to the cluster manager (Standalone in this setup)
Cluster Manager: manages all resources, such as executors
Spark acquires the executors, then ships our packages/code to every executor
The SparkContext sends all tasks to the executors
Core: degree of parallelism per executor, e.g. 5
Executors: number of executors, roughly CPU cores / 5
Memory: memory per executor = total memory / number of executors
Executor Total Memory = ExecutorMemory + MemoryOverhead
MemoryOverhead = max( 384M, 0.07 x spark.executor.memory)
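To make the formulas concrete, here is a rough calculation for a hypothetical worker node with 16 cores and 64 GB of RAM (the numbers are only an illustration, not my actual setup):
Cores per executor        = 5
Executors per node        = 16 cores / 5 ≈ 3
Total memory per executor = 64 GB / 3 ≈ 21 GB
spark.executor.memory     ≈ 19 GB
MemoryOverhead            = max(384M, 0.07 x 19 GB) ≈ 1.4 GB
Executor Total Memory     ≈ 19 GB + 1.4 GB ≈ 20.4 GB, which fits inside the 21 GB slot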
Finally, I made it work with the Zeppelin notebook, a Spark master, and Spark slaves. For example:
192.168.56.110 rancher-home Zeppelin Book, Spark Master
192.168.56.111 rancher-worker1 Spark Slave
192.168.56.112 rancher-worker2 Spark Slave
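For the host networking used in the commands below, each machine needs to resolve these hostnames. If they are not already in DNS, a minimal /etc/hosts entry on every node (addresses as above; just a sketch) could be:
192.168.56.110 rancher-home
192.168.56.111 rancher-worker1
192.168.56.112 rancher-worker2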
Spark Master on rancher-home
Dockerfile including R and Python ENV
#Set up spark master in Docker
#Prepare the OS
FROM centos:7
MAINTAINER Yiyi Kang <yiyikangrachel@gmail.com>
RUN yum -y update
RUN yum install -y wget
#install java
RUN yum -y install java-1.8.0-openjdk.x86_64
RUN echo 'export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk' | tee -a /etc/profile
#prepare python
RUN yum groupinstall -y "Development tools"
RUN yum -y install git freetype-devel openssl-devel libffi-devel
RUN git clone https://github.com/pyenv/pyenv.git ~/.pyenv
ENV HOME /root
ENV PYENV_ROOT $HOME/.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
RUN pyenv install 3.7.5
RUN pyenv global 3.7.5
#prepare R
RUN yum install -y epel-release
RUN yum install -y R
RUN mkdir /tool/
WORKDIR /tool/
#add the software spark
RUN wget --no-verbose http://apache.mirrors.ionfish.org/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
RUN tar -xvzf spark-2.4.4-bin-hadoop2.7.tgz
RUN ln -s /tool/spark-2.4.4-bin-hadoop2.7 /tool/spark
ADD conf/spark-env.sh /tool/spark/conf/
#python libraries
RUN pip install --upgrade pip
RUN pip install pandas
RUN pip install -U pandasql
RUN pip install matplotlib
#R libraries
RUN R -e "install.packages('data.table',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('knitr',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('googleVis',dependencies=TRUE, repos='http://cran.rstudio.com/')"
#set up the app
EXPOSE 8088 7077
RUN mkdir -p /app/
ADD start.sh /app/
WORKDIR /app/
CMD [ "./start.sh” ]
Makefile to support memory parameter and hostname parameter
HOSTNAME=rancher-home
MEMORY=2g
IMAGE=sillycat/public
TAG=sillycat-sparkmaster-1.0
NAME=sillycat-sparkmaster-1.0
docker-context:
build: docker-context
	docker build -t $(IMAGE):$(TAG) .
run:
	docker run -d \
	-e "SPARK_LOCAL_HOSTNAME=$(HOSTNAME)" \
	-e "SPARK_IDENT_STRING=$(HOSTNAME)" \
	-e "SPARK_PUBLIC_DNS=$(HOSTNAME)" \
	-e "SPARK_DAEMON_MEMORY=$(MEMORY)" \
	--network host \
	--name $(NAME) $(IMAGE):$(TAG)
clean:
	docker stop ${NAME}
	docker rm ${NAME}
logs:
	docker logs ${NAME}
publish:
	docker push ${IMAGE}
start.sh to start the Spark master
#!/bin/sh -ex
#prepare ENV
#start the service
cd /tool/spark
sbin/start-master.sh
Settings in conf/spark-env.sh to support the port number
SPARK_MASTER_PORT=7077
SPARK_MASTER_WEBUI_PORT=8088
SPARK_NO_DAEMONIZE=true
I use this command to start the container
>make run HOSTNAME=rancher-home MEMORY=1g
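To check that the master came up, I can tail the container logs and hit the web UI on the port configured in spark-env.sh (8088); this is just a quick sanity check:
> make logs
> curl -I http://rancher-home:8088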
Zeppelin on the rancher-home machine
Dockerfile containing all the libraries and software
#Set up Zeppelin Notebook
#Prepare the OS
FROM centos:7
MAINTAINER Yiyi Kang <yiyikangrachel@gmail.com>
RUN yum -y update
RUN yum install -y wget
#java
RUN yum -y install java-1.8.0-openjdk.x86_64
RUN echo 'export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk' | tee -a /etc/profile
#prepare python
RUN yum groupinstall -y "Development tools"
RUN yum -y install git freetype-devel openssl-devel libffi-devel
RUN git clone https://github.com/pyenv/pyenv.git ~/.pyenv
ENV HOME /root
ENV PYENV_ROOT $HOME/.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
RUN pyenv install 3.7.5
RUN pyenv global 3.7.5
#prepare R
RUN yum install -y epel-release
RUN yum install -y R
RUN mkdir /tool/
WORKDIR /tool/
#add the software zeppelin
RUN wget --no-verbose http://www.gtlib.gatech.edu/pub/apache/zeppelin/zeppelin-0.8.2/zeppelin-0.8.2-bin-all.tgz
RUN tar -xvzf zeppelin-0.8.2-bin-all.tgz
RUN ln -s /tool/zeppelin-0.8.2-bin-all /tool/zeppelin
#add the software spark
RUN wget --no-verbose http://apache.mirrors.ionfish.org/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
RUN tar -xvzf spark-2.4.4-bin-hadoop2.7.tgz
RUN ln -s /tool/spark-2.4.4-bin-hadoop2.7 /tool/spark
#python libraries
RUN pip install --upgrade pip
RUN pip install pandas
RUN pip install -U pandasql
RUN pip install matplotlib
#R libraries
RUN R -e "install.packages('data.table',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('knitr',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('googleVis',dependencies=TRUE, repos='http://cran.rstudio.com/')"
#set up the app
EXPOSE 8080 4040
RUN mkdir -p /app/
ADD start.sh /app/
WORKDIR /app/
CMD [ "./start.sh" ]
Makefile to start the container with host network
IMAGE=sillycat/public
TAG=sillycat-zeppelinbook-1.0
NAME=sillycat-zeppelinbook-1.0
docker-context:
build: docker-context
	docker build -t $(IMAGE):$(TAG) .
run:
	docker run -d --privileged=true \
	-v $(shell pwd)/zeppelin/notebook:/tool/zeppelin/notebook \
	-v $(shell pwd)/zeppelin/conf:/tool/zeppelin/conf \
	--network host \
	--name $(NAME) \
	$(IMAGE):$(TAG)
clean:
	docker stop ${NAME}
	docker rm ${NAME}
logs:
	docker logs ${NAME}
publish:
	docker push ${IMAGE}
The start.sh script
#!/bin/sh -ex
#start the service
cd /tool/zeppelin
bin/zeppelin.sh
Settings in zeppelin/conf/zeppelin-env.sh
export SPARK_HOME=/tool/spark
export MASTER=spark://rancher-home:7077
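Once the container is up, Zeppelin should answer on its default port 8080 (assuming the port has not been changed in zeppelin-env.sh):
> curl -I http://rancher-home:8080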
A very important thing is how to add dependencies.
In the interpreter settings, add the dependency under Dependencies:
Artifact: mysql:mysql-connector-java:5.1.47
That only covers the driver and the notebook; we also need to add the property below to make it work on all the slaves.
spark.jars.packages: mysql:mysql-connector-java:5.1.47
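If you prefer not to configure this through the Zeppelin UI, the same property can also go into conf/spark-defaults.conf under SPARK_HOME (an alternative I did not use here, shown only as a sketch):
spark.jars.packages  mysql:mysql-connector-java:5.1.47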
The Spark slave setup is similar to the master.
Dockerfile
#Set up spark slave in Docker
#Prepare the OS
FROM centos:7
MAINTAINER Yiyi Kang <yiyikangrachel@gmail.com>
RUN yum -y update
RUN yum install -y wget
#install jdk
RUN yum -y install java-1.8.0-openjdk.x86_64
RUN echo 'export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk' | tee -a /etc/profile
#prepare python
RUN yum groupinstall -y "Development tools"
RUN yum -y install git freetype-devel openssl-devel libffi-devel
RUN git clone https://github.com/pyenv/pyenv.git ~/.pyenv
ENV HOME /root
ENV PYENV_ROOT $HOME/.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
RUN pyenv install 3.7.5
RUN pyenv global 3.7.5
#prepare R
RUN yum install -y epel-release
RUN yum install -y R
RUN mkdir /tool/
WORKDIR /tool/
#add the software spark
RUN wget --no-verbose http://apache.mirrors.ionfish.org/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
RUN tar -xvzf spark-2.4.4-bin-hadoop2.7.tgz
RUN ln -s /tool/spark-2.4.4-bin-hadoop2.7 /tool/spark
ADD conf/spark-env.sh /tool/spark/conf/
#python libraries
RUN pip install --upgrade pip
RUN pip install pandas
RUN pip install -U pandasql
RUN pip install matplotlib
#r libraries
RUN R -e "install.packages('data.table',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('knitr',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('googleVis',dependencies=TRUE, repos='http://cran.rstudio.com/')"
#set up the app
EXPOSE 8188 7177
RUN mkdir -p /app/
ADD start.sh /app/
WORKDIR /app/
CMD [ "./start.sh” ]
Makefile that needs to connect to the master machine
HOSTNAME=rancher-worker1
MASTER=rancher-home
SPARK_WORKER_CORES=2
SPARK_WORKER_MEMORY=2g
IMAGE=sillycat/public
TAG=sillycat-sparkslave-1.0
NAME=sillycat-sparkslave-1.0
docker-context:
build: docker-context
	docker build -t $(IMAGE):$(TAG) .
run:
	docker run -d \
	-e "SPARK_PUBLIC_DNS=$(HOSTNAME)" \
	-e "SPARK_LOCAL_HOSTNAME=$(HOSTNAME)" \
	-e "SPARK_IDENT_STRING=$(HOSTNAME)" \
	-e "SPARK_MASTER=$(MASTER)" \
	-e "SPARK_WORKER_CORES=$(SPARK_WORKER_CORES)" \
	-e "SPARK_WORKER_MEMORY=$(SPARK_WORKER_MEMORY)" \
	--name $(NAME) \
	--network host \
	$(IMAGE):$(TAG)
clean:
	docker stop ${NAME}
	docker rm ${NAME}
logs:
	docker logs ${NAME}
publish:
	docker push ${IMAGE}
Shell script start.sh
#!/bin/sh -ex
#start the service
cd /tool/spark
sbin/start-slave.sh spark://${SPARK_MASTER}:7077
Settings in conf/spark-env.sh
SPARK_WORKER_PORT=7177
SPARK_WORKER_WEBUI_PORT=8188
SPARK_IDENT_STRING=rancher-worker1
SPARK_NO_DAEMONIZE=true
The command to start it will be similar to this:
>make run MASTER=rancher-home HOSTNAME=rancher-worker1 SPARK_WORKER_CORES=2 SPARK_WORKER_MEMORY=2g
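After the slave container starts, the worker should register itself with the master and show up under Workers on the master web UI at http://rancher-home:8088. A quick check from the worker machine (assuming the ports configured above):
> make logs
> curl -I http://rancher-worker1:8188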
References:
https://stackoverflow.com/questions/38820979/docker-image-error-downloading-package
Memory
https://www.jianshu.com/p/a8b61f14309f
https://blog.51cto.com/10120275/2364992
https://taoistwar.gitbooks.io/spark-operationand-maintenance-management/content/spark_install/spark_standalone_configuration.html
Zeppelin Login Issue
https://stackoverflow.com/questions/46685400/login-to-zeppelin-issues-with-docker
Zeppelin Dependencies Issue
http://zeppelin.apache.org/docs/0.8.2/interpreter/spark.html#dependencyloading