Hosting web server on AWS¶
Latest updates¶
My AWS EC2 instance is somehow completely down and I have no clue how to fix it. I can’t even SSH into it. I also tried rebuilding and cloning the environment, and both failed.
I decided to use a Docker container and AWS ECS (Amazon’s container service, which can run containers on EC2 instances) to host my web application because EC2 is so hard to manage:
With EC2, I have to build the VM by SSHing into it, installing dependencies, and downloading data. With a Docker container, I can do all of that locally and check whether my changes look correct.
If the server error happens again, I can just create a new server and upload the Docker container without ever rebuilding the dependencies.
My deployment finally succeeded by following this tutorial: https://acloudguru.com/blog/engineering/deploying-a-containerized-flask-application-with-aws-ecs-and-docker
Note
The above link is old. AWS has updated its UI, so the screenshots are misleading now. The steps are: create Cluster X, create a task definition, click Cluster X, and select “create task”. Another important note: you have to update the “Security Groups” for this task with an inbound rule: IPv4, HTTP, TCP, port 80, source 0.0.0.0/0
The key is to directly use port 80 in app.py
if __name__ == "__main__":
    app.run(host='0.0.0.0', port=80)
and in the Dockerfile
EXPOSE 80
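After the task is running, a quick way to confirm that the port 80 settings above (app.py, Dockerfile, and the security group rule) actually line up is to try a plain TCP connection from your own machine. This is only a minimal sketch using the Python standard library; `port_is_open` is a hypothetical helper name, not part of any AWS tool.

```python
import socket


def port_is_open(host: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Call it as `port_is_open("<public IP of your ECS task>", 80)`. If it returns False while the container works locally, the security group rule or the port mapping is the first thing I would re-check.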
The following tutorial didn’t work, mainly due to port 8050 issues. I don’t know if it is because AWS updated their protocol.
I found that uploading to Docker Hub is slower than uploading to AWS ECR, but Docker Hub is free. With the hg19 genome, my Docker image is almost 8 GB. That seems fine as long as the image stays below 20 GB.
my commands¶
# create docker
docker build -t easy_prime .
# test docker container to identify bugs before publish
docker run -p 80:80 easy_prime
# upload docker container
docker tag easy_prime:latest liyc1989/easy_prime:latest
docker push liyc1989/easy_prime:latest
# set up AWS server using browser
# Remember to add port mapping 80
AWS ECS changes IP every time¶
May need to use load balancer: https://www.youtube.com/watch?v=TsVO14-lqp0
Summary¶
You can get a free one-year account on: https://aws.amazon.com/free/
Currently, I’m not sure whether AWS will charge you if the web app is over-used, e.g. if CPU time goes beyond their “free zone”.
8/4/2020
It turns out that you can easily exceed the free 750 hours per month (about one instance running non-stop for a month) if you don’t realize that you have services in more than one cluster. AWS charged me $4 in July and estimated the cost at $8 per month.
I then terminated all services and asked for a refund.
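The arithmetic behind that surprise is simple. Here is a minimal sketch, assuming every service keeps one instance running around the clock and that the 750 hours are shared across all of them; the function names are made up for illustration.

```python
FREE_TIER_HOURS = 750  # monthly free allowance mentioned above


def instance_hours(n_instances: int, days: int = 31) -> int:
    """Total instance-hours when n_instances run non-stop for `days` days."""
    return n_instances * 24 * days


def hours_over_free_tier(n_instances: int, days: int = 31) -> int:
    """Hours beyond the free allowance (0 if still within it)."""
    return max(0, instance_hours(n_instances, days) - FREE_TIER_HOURS)
```

One instance for a 31-day month is 744 hours, just under the limit. Two instances are 1488 hours, so 738 of them are billed.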
8/20/2020
It seems that eb create will create running instances in multiple regions, e.g., N. Virginia and Oregon.
Usage¶
Step 1: register an AWS account¶
You can use the free one here: https://aws.amazon.com/free/
Go to: https://console.aws.amazon.com, and create a new access key
Step 2: install command line tools for AWS Elastic Beanstalk¶
conda install -c contango awsebcli
## This awsebcli conda package is listed only for win64; however, I installed it successfully on macOS, not sure why.
Step 3: Dash app toy example¶
Now, suppose you have a Dash app already and you want to deploy it to EB.
Ref: https://medium.com/@korniichuk/dash-on-aws-44a0f50a030a
Create a new folder, test, then copy the following Dash app and save it as application.py. This file name is a keyword.
For other keywords, see http://www.zhengwenjie.net/beanstalk/
import dash
import dash_html_components as html
app = dash.Dash(__name__)
app.scripts.config.serve_locally = True
app.css.config.serve_locally = True
app.layout = html.Div([
    html.H1('Hello, World!')
])
application = app.server
if __name__ == '__main__':
    application.run(debug=True, port=8080)
Next, copy the Python dependencies and save them as requirements.txt. Again, this file name is a keyword.
dash==0.39.0
dash-daq==0.1.0
Then open a terminal, go to the folder test, and type the following commands:
eb init
# It may ask for the access key ID and secret key that you created in step 1
# Do you want to set up SSH for your instances?
# (Y/n): Enter n
eb create
eb open
If you see Hello World, then congratulations!
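Since the file names are keywords, it can save a failed deploy to check the folder before running eb init. The sketch below is my own helper, not something the EB CLI provides. It assumes, following the example above, that EB wants application.py, a module-level variable called `application`, and requirements.txt.

```python
import ast
from pathlib import Path


def check_eb_folder(folder: str) -> list:
    """Return a list of problems found; an empty list means the folder looks fine."""
    root = Path(folder)
    problems = []
    app_file = root / "application.py"
    if not app_file.is_file():
        problems.append("application.py is missing")
    else:
        tree = ast.parse(app_file.read_text())
        # Collect names assigned at module level, e.g. `application = app.server`.
        assigned = {
            target.id
            for node in tree.body
            if isinstance(node, ast.Assign)
            for target in node.targets
            if isinstance(target, ast.Name)
        }
        if "application" not in assigned:
            problems.append("application.py does not define `application`")
    if not (root / "requirements.txt").is_file():
        problems.append("requirements.txt is missing")
    return problems
```

Running `check_eb_folder("test")` on the toy example should return an empty list. The file is only parsed, never imported, so Dash does not need to be installed locally for the check.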
Step 4: Upload your own Dash app¶
Basically, if you have finished step 3 then you should be able to upload any python programs.
I want to put my Easy-Prime tool up there and have encountered several problems. Here’s how I solved them.
I put all the dependencies in requirements.txt. I didn’t pin versions because I thought pinning could cause conflicts.
dash
dash-daq
biopython
dash-bio
dash-html-components
joblib
matplotlib
numpy
pandas
plotly
plotly-express
PyYAML
scikit-image
scikit-learn
scipy
seaborn
I had a gcc problem and found a solution. First, create a folder called .ebextensions and, inside it, a file called 01_packages.config.
packages:
  yum:
    gcc-c++: []
    unixODBC-devel: []
    python3-devel: []
Indent with spaces, not tabs.
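Tabs are hard to spot in an editor, so here is a small sketch that reports which lines of a config file are indented with a tab. `lines_with_tabs` is a hypothetical helper; it only looks at leading whitespace and does not validate the YAML itself.

```python
from pathlib import Path


def lines_with_tabs(config_path: str) -> list:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    text = Path(config_path).read_text()
    for number, line in enumerate(text.splitlines(), start=1):
        # Everything before the first non-whitespace character is the indent.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad
```

For example, `lines_with_tabs(".ebextensions/01_packages.config")` should return an empty list before you deploy.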
I found that SSH is the easiest way to install things. eb ssh will SSH into the instance of the environment tied to the current working directory; otherwise you can use eb ssh env_name.
Your app is stored at /var/app/current and your Python is /var/app/venv/bin/python. By default, you can’t write to these directories, so you need to use sudo. I don’t know why they give you sudo but don’t make the directories directly writable.
sudo yum groupinstall "Development Tools"
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/compile-software.html
Again, this is obviously necessary, but you have to install it yourself.
The default EB disk size is 8 GB. When I added hg19.fa, it used up all the space and I got a “no space” error. I had to increase the disk space in EC2. I don’t know whether that costs extra.
To update your code on EB, use eb deploy. Note that eb deploy will remove all the old code. For small changes, I modify the code directly on the server. There should be some git pull method.
To increase space, simply increasing the volume size on the web page will not work. The method here: https://til.codes/extending-the-disk-space-on-an-amazon-ec2-instance/ did not completely solve my problem, but it gave me a good start. Eventually, the commands I’m using are:
lsblk # to look at the space
sudo growpart /dev/xvda 1
sudo xfs_growfs -d /mnt
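Before downloading something as large as hg19.fa, it may help to check from Python whether the resized disk really has the room. A minimal sketch using only the standard library; the helper names are made up, and you need to supply the file size yourself.

```python
import shutil


def free_gib(path: str = "/") -> float:
    """Free space on the filesystem containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024 ** 3


def has_room_for(size_gib: float, path: str = "/") -> bool:
    """True if at least `size_gib` GiB are free at `path`."""
    return free_gib(path) >= size_gib
```

On the instance I would call something like `has_room_for(4, "/var/app/current")`, where 4 is only a placeholder for the actual size of the genome file in GiB.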
TODO: I heard that “AWS S3 + Lambda” is much cheaper.
Step 5. update eb app¶
Please do not delete or rebuild your env; otherwise you will have to configure a lot of things again.
Things I have done: installed many Python packages, e.g. dash, and some bioinformatics tools, e.g. htslib.
Now that I have a new Dash app, all I need to do is upload it as a zip file and then deploy it, all using a browser!
Where to upload and deploy¶
link: https://us-west-2.console.aws.amazon.com/elasticbeanstalk/home?region=us-west-2#/environments
Find your application, click Actions, and go to view versions.
Click upload first; when it finishes, choose the new version and deploy it.
Then you can view the deploy logs.
Once you have successfully deployed, you can use the SSH terminal for further updates; for example, I need to download hg19 to the /var/app/current folder.
Upload size error¶
Edit the config with nano /etc/nginx/nginx.conf and add client_max_body_size 50M; inside the server block. Then run service nginx restart or systemctl reload nginx.
The bw file I’m using: https://www.dropbox.com/s/ojqvi0pbnw975cl/SRR8056671_293T.rmdup.uq.bw
server {
    listen 80 default_server;
    access_log /var/log/nginx/access.log main;
    client_header_timeout 60;
    client_body_timeout 60;
    keepalive_timeout 60;
    client_max_body_size 50M;
    gzip off;
    gzip_comp_level 4;
    gzip_types text/plain text/css application/json application/javascript application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    # Include the Elastic Beanstalk generated locations
    include conf.d/elasticbeanstalk/*.conf;
}
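To see whether a given file will hit the upload size error, you can compare its size with the client_max_body_size value. This is a rough sketch with hypothetical helper names. It treats the k, m, and g suffixes as powers of 1024, and it ignores the fact that the actual request body can be larger than the file because of upload encoding, so take a True result with some margin.

```python
import os

_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_nginx_size(value: str) -> int:
    """Convert an nginx size such as '50M' or '1024' to bytes."""
    value = value.strip()
    suffix = value[-1:].lower()
    if suffix in _UNITS:
        return int(value[:-1]) * _UNITS[suffix]
    return int(value)  # no suffix: the value is already in bytes


def fits_upload_limit(file_path: str, limit: str = "50M") -> bool:
    """True if the file is no larger than the configured body-size limit."""
    return os.path.getsize(file_path) <= parse_nginx_size(limit)
```

For example, `fits_upload_limit("SRR8056671_293T.rmdup.uq.bw")` checks the bw file against the 50M setting used above.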
Notes¶
eb logs
eb ssh
Your DASH stdout is here: /var/log/web.stdout.log
re-build instances¶
Today when I checked on Easy-Prime again, the server was down! I found that the environment was just gone, so I have to start over. My AWS EB instance was replaced with a new one. From what I read online, this could be caused by AWS auto-scaling, but I’m still not sure why it happened. Now I have to reinstall everything.
Memory allocation problem¶
5891 webapp 20 0 1388604 199624 51280 S 0.0 19.8 0:02.30 gunicorn
17153 webapp 20 0 234568 17876 2952 S 0.0 1.8 21:46.91 gunicorn
Solution: run top -u webapp, find the process with the highest memory usage, and kill it.
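Picking the heaviest process by eye is fine for two rows, but the same thing can be done on pasted top output. A minimal sketch, assuming the default top column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND) and a RES column in plain KiB; `heaviest_process` is a made-up name.

```python
def heaviest_process(top_lines):
    """Return (pid, res_kib) for the row with the largest RES column, or None."""
    best = None
    for line in top_lines:
        fields = line.split()
        # Skip headers, blank lines, and rows where RES is not a plain number.
        if len(fields) < 12 or not fields[0].isdigit() or not fields[5].isdigit():
            continue
        pid, res = int(fields[0]), int(fields[5])
        if best is None or res > best[1]:
            best = (pid, res)
    return best
```

With the two gunicorn rows shown above, it returns (5891, 199624), i.e. PID 5891 is the one to kill.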