NGINX: “A Server Built for a Modern World”

Cause life’s a constant change, and nothing stays the same~

– Constant Change, Jose Mari Chan

 

Hello there! Haha, so what’s with the cheesy line from Jose Mari Chan’s Constant Change (a song we also love in the family!)? Recently, I was able to read up on Nginx and get some hands-on experience with it.

Nginx is a powerful web server that surfaced in the early 2000s. It was built to address the C10K problem: the difficulty servers of that time had in handling 10,000 simultaneous connections on a single machine, due to operating system and software constraints. This problem may be unimaginable nowadays, especially with our hi-tech software and hardware that can process a gazillion bits per minute. But back in the 1990s and the early 2000s, this was the case.

During that time, Apache was the most popular web server (and it still is; Nginx only comes second). Released in 1995, it is one of the forerunners that gave birth to and shaped the World Wide Web that we know today.

Nginx aimed to handle many concurrent connections, survive high traffic, and serve webpages extremely fast, none of which was part of the original purpose of Apache’s designers. Who would have thought that network traffic would boom this big?

Actually, while reading up on its history, I was reminded of one of our earlier blog posts about the original intentions and purpose of TCP/IP and how time has also shaped its direction.

Nginx is relatively young and has fewer features than Apache but, as Clement Nedelcu says in his book Nginx HTTP Server, “it was intended for a more modern era”.

Armed with knowledge of the motivation behind Nginx, let us now see how we can get started with it and what basic configurations we can make to suit our needs. For this post, we’ll be running Nginx on an Ubuntu Server 14.04.

Nginx Installation

Install Nginx with apt-get.

Nginx can also be compiled from source. This is usually done on OS distributions that don’t provide it in their package managers or whose packaged versions are outdated. Compiling from source also gives greater flexibility, as you can specify which modules to include during compilation. But for our purpose, we can just use apt-get to install Nginx.

sudo apt-get update
sudo apt-get install nginx

Upon installation, Nginx automatically starts. To see this, run:

ps -ef | grep nginx

Screen Shot 2015-12-12 at 9.29.24 PM.png

We see that Nginx started one master process and 4 worker processes that are ready to accept connections. It is also worth noting that the master process is owned by root while the worker processes are owned by www-data.

The main purpose of the master process is to read and evaluate the configuration file and maintain the worker processes. The worker processes are the ones that handle the actual requests. In case one worker dies, the master process ensures that another one is spawned.

Nginx Default Configuration

Once we have Nginx up and running, let’s now take a look at the default configuration of Nginx. Head over to the default configuration file at:

/etc/nginx/nginx.conf

Throughout the file, you may notice one-line configurations that consist of a name, a value, and a semicolon to end the line. These are what we call directives. Directives make up the majority of Nginx configuration files.

Let’s take a closer look at some of the parts:

user www-data;
worker_processes 4;
pid /run/nginx.pid;

  1. user: It’s good practice to run the worker processes as a dedicated non-root user rather than as root. Since the default already comes with a non-root user (www-data), we can leave this as it is or, if you prefer, create your own dedicated user for the worker processes.
  2. worker_processes: Remember the 4 worker processes we found a while ago? This is the directive that controls them. The number of worker processes is usually set to the number of CPU cores so that each worker can run on its own core. Specifying more workers than cores may hurt performance, and specifying fewer could leave some of your computer’s power unused.
    In case one of your worker processes dies, another one is spawned so that the number of workers specified in your configuration file is still met.
  3. pid: The file where the process ID of the master Nginx process is stored.
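
If you would rather not hard-code the number of workers, here is a minimal sketch of the same three directives using the auto value. This assumes your Nginx version supports it (as far as I know, 1.2.5 / 1.3.8 and later do); you can compare the result against the core count reported by a command such as nproc.

```nginx
# /etc/nginx/nginx.conf (top level, outside any block)
user www-data;

# "auto" asks Nginx to start one worker per CPU core it detects,
# instead of a fixed number such as 4.
worker_processes auto;

pid /run/nginx.pid;
```

After a reload, ps -ef | grep nginx should list as many worker processes as the machine has cores.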

Throughout the configuration file, you may also see directives enclosed in curly braces ({}); we call these blocks. Directives are grouped together in a block because they belong to the same module or context (such as events in our example below).

events {
 worker_connections 768;
 # multi_accept on;
}

  1. worker_connections: the maximum number of connections each worker process can handle at the same time. With our 4 workers, that works out to roughly 4 × 768 = 3,072 simultaneous connections.

http {

 ...

 include /etc/nginx/mime.types;

 ...

 access_log /var/log/nginx/access.log;
 error_log /var/log/nginx/error.log;

 include /etc/nginx/conf.d/*.conf;
 include /etc/nginx/sites-enabled/*;
}

The location of the logs can be customized via the access_log and error_log directives. By default, access logs go to /var/log/nginx/access.log and error logs go to /var/log/nginx/error.log.

Nginx allows the inclusion of external files with the include directive. The contents of an included file are inserted exactly where the include directive appears.

The last line in the above example includes the files under the sites-enabled folder. This folder holds the config files for each of your sites if your server hosts multiple sites. By convention, one file contains the configuration of one site. This keeps configurations properly separated, so adjustments can easily be made to one site without affecting another. By default, we have one config file under sites-enabled, and it is named default.

Opening it, we can see more configuration blocks and one of them is the following:

server {
 listen 80 default_server;
 listen [::]:80 default_server ipv6only=on;

 root /usr/share/nginx/html;
 index index.html index.htm;

 # Make site accessible from http://localhost/
 server_name localhost;

In our server block, we see that by default Nginx listens on port 80, the root is set to /usr/share/nginx/html, and the default server name is localhost.

Before we try changing the defaults, let’s ensure that our setup works and can be accessed on a browser:

Screen Shot 2015-12-13 at 5.28.33 PM.png

Now, let’s try to change some of these defaults:

a. Changing the default listening port

To change the port on which Nginx is listening, let us change the default port 80 on the server block.

listen 80 default_server;
listen [::]:80 default_server ipv6only=on;

Say we want it to listen on port 14344 instead. We now have:

listen 14344 default_server;
listen [::]:14344 default_server ipv6only=on;

To apply our changes, make sure to reload Nginx every time you edit the configurations:

sudo service nginx reload

In case the service refuses to reload or your changes weren’t applied, check whether your configuration file is valid (i.e. has the correct syntax, no typographical errors, etc.). For this purpose, Nginx provides a way to test the configuration file’s syntax:

sudo nginx -t

Screen Shot 2015-12-12 at 10.43.14 PM.png

Once we have successfully reloaded our Nginx config, let’s check that it works. If we refresh our browser without specifying a port (so that it uses the default, 80), our site is now nowhere to be found.

Screen Shot 2015-12-12 at 10.43.45 PM.png

But if we specify port 14344 then refresh, we can now see our site:

Screen Shot 2015-12-12 at 10.43.58 PM.png

b. Changing the root

Say we have our application directory at /srv/my_app, and the static pages to be served by Nginx are at /srv/my_app/public. To set the root in which Nginx looks for the files it serves (i.e. static pages), we change the root directive:

root /srv/my_app/public;

With this as the root, Nginx also looks in this folder for the index file named by the index directive (index.html, index.htm). Now when we refresh <ip>:14344, we can see the index page that I added at /srv/my_app/public.

Screen Shot 2015-12-12 at 11.08.24 PM.png

index.html at /srv/my_app/public


<h1>Welcome to the index page!</h1>
<h4>Because all great things start with a single step, right?</h4>

c. Using Custom Error Message Pages (Error 500, 404, etc)

Just as we changed the application root in (b) and saw a different index file, we can also specify custom error pages to be shown on our site. This way, we can tailor them, for example to tell visitors what to do when their actions lead to such errors.

We can customize such errors with the error_page directive in sites-enabled/default:

error_page 404 /my_404.html;

Then add the file my_404.html to /srv/my_app/public. Mine looks like this:

<div style="min-height: 100%; background-size: cover; background-image: url('traffic.jpeg');">
<center style="padding: 150px; color: white; font-size: 40px;">
<h1>404: Not Found</h1>
<h4>But don't worry, not all who wander are lost. :)</h4>
</center>
</div>

So when we go to a page that doesn’t exist on our site, our customized 404 page is displayed.

Screen Shot 2015-12-12 at 11.56.42 PM.png
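
The same directive can cover several status codes at once. Below is a minimal sketch, not the exact setup used in this post: it assumes a 50x.html file also exists under the root, and it adds the internal directive so that the error pages are only shown through error_page and not when someone requests them directly.

```nginx
# Inside the server block of sites-enabled/default
error_page 404 /my_404.html;

# One page for several server-side errors
# (assumes 50x.html has been placed under the root folder)
error_page 500 502 503 504 /50x.html;

# "internal" makes these pages reachable only via error_page,
# not by visiting /my_404.html or /50x.html directly.
location = /my_404.html {
    internal;
}

location = /50x.html {
    internal;
}
```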

e. Customized Cache Headers from Nginx

Nginx delivers static pages really, really fast as it reads them directly from the file system rather than having the app server process them as requests. In addition to this, Nginx can tell browsers to cache files: you can specify which files you want cached and for how long. For this purpose, we can add the following inside the server block of sites-enabled/default:

location ~* \.(jpg|jpeg|png|gif)$ {
 expires 30d;
}
location ~* \.(ico|css|js)$ {
 expires 5h;
}

What this does is set the expiry of files with jpg, jpeg, png, and gif extensions to 30 days, and of files with ico, css, and js extensions to 5 hours. Nginx does this by adding Expires and Cache-Control headers to the response, which tell the browser how long it may reuse its cached copy.

When we load our traffic.jpeg file and inspect its response headers, the 30-day expiration we set is reflected.

Screen Shot 2015-12-13 at 1.31.41 AM.png
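
The expires directive accepts a few other values besides a duration. Here is a small sketch of two of them; the /assets/ path and the rule for .html files are hypothetical and only there for illustration.

```nginx
# Inside the server block of sites-enabled/default

# Hypothetical: files under /assets/ that never change once deployed.
# "max" sets an expiry far in the future.
location ^~ /assets/ {
    expires max;
}

# Hypothetical: ask browsers not to reuse cached HTML pages.
# A negative value makes Nginx send "Cache-Control: no-cache".
location ~* \.html$ {
    expires -1;
}
```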

f. Using Nginx as a Reverse Proxy Server

A reverse proxy server acts as an intermediary between clients and the servers behind it. It accepts requests on behalf of those servers and directs traffic to them. Reverse proxy servers provide an additional level of abstraction and control to ensure the smooth flow of traffic.

For this purpose, Nginx provides the upstream block where we can specify the servers to which we want our traffic routed.

Say we have this Sinatra application running on localhost:4567.


require 'sinatra'

# Listen on all interfaces, on port 4567
set :bind, '0.0.0.0'
set :port, 4567

get '/' do
  "Hello, world! I am from Sinatra ONE"
end

We can forward connections to it by adding the following upstream block in the file sites-enabled/default:

Screen Shot 2015-12-13 at 1.55.15 AM.png

And an inner location block to the server block:

Screen Shot 2015-12-13 at 6.19.41 PM.png

What this does is forward requests matching / to the app1 server group we declared earlier.
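
Since the two blocks above only appear as screenshots, here is a text sketch of what they presumably look like, pieced together from the final configuration at the end of this post. The proxy_set_header lines are an optional addition of mine and were not part of the original setup.

```nginx
# In sites-enabled/default, above the server block
upstream app1 {
    server localhost:4567;   # the Sinatra application
}

server {
    listen 14344 default_server;
    # (other directives of the existing server block go here)

    location / {
        # Hand every request matching / to the app1 server group
        proxy_pass http://app1;

        # Optional (assumption, not in the original setup):
        # pass the original host and client address to the app
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```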

And then proceeding to port 14344 of our server via the browser, we can see:

Screen Shot 2015-12-13 at 2.01.04 AM.png

Awesome, isn’t it? 🙂

In addition to acting as a reverse proxy server, Nginx can also act as a load balancer. When you specify multiple servers in the upstream block, Nginx by default distributes connections among them in round-robin fashion. Other methods include least connected (least_conn) and IP hash (ip_hash), which suits cases where a client’s session must stay on the same server.

Let’s try running a second copy of the Sinatra app on port 4568. We can tweak the text a little bit so that it identifies itself as the second Sinatra application. Let’s also add another server directive to our upstream block:

Screen Shot 2015-12-13 at 1.55.56 AM.png

 

Refreshing our page multiple times, we can see that some requests get forwarded to our first Sinatra application while some to the second one at port 4568.

output_yvMZwp.gif

In case we want more traffic to be routed to one of our servers, we can add a weight parameter to the server directive in our upstream block:

server localhost:4567 weight=2;
server localhost:4568;

So if we have this, for every three requests, two go to port 4567 and one goes to port 4568.
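
The other balancing methods mentioned earlier (least connected and IP hash) are switched on with a single directive inside the upstream block. A minimal sketch, reusing the two Sinatra servers from this post:

```nginx
upstream app1 {
    # Pick ONE method. With none specified, Nginx uses round robin.
    least_conn;     # send each request to the server with the fewest active connections
    # ip_hash;      # alternative: keep each client IP on the same server

    server localhost:4567;
    server localhost:4568;
}
```

With ip_hash, refreshing the page from the same machine should keep landing on the same Sinatra application instead of alternating between the two.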

g. Specifying Path for ELB Health Check

In our previous post on load balancers, we talked about health checks, which load balancers use to determine whether our server is still up. In one of our examples, we set our health check to be the following:

Screen Shot 2015-12-13 at 6.09.39 PM.png

This is OK but could be optimized further. With that setup, each health check request to / is passed on to our server, which has to process and load the homepage (which could contain image, CSS, JS, and other files) before anything is returned to AWS. Loading the whole page might just add unnecessary load to our server. Instead, we can have Nginx return a success status to AWS right away.

We can tell Nginx to immediately return an OK status by adding the following location block inside the server block of our sites-enabled/default file:

location /elb-status {
 return 200 'Alive!';
 add_header Content-Type text/plain;
}

What we did above is add a dedicated location block so that when health checks are made at /elb-status, a 200 is returned immediately. For ELB to reach it, make sure that you set the Health Check Ping Path to /elb-status and the Port to 14344 in your ELB configuration.
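
A possible variation, offered as a sketch rather than the setup used in this post: as far as I understand, the body sent by return takes its Content-Type from default_type, so setting that directive avoids adding a second Content-Type header through add_header. Turning off access logging for this location is also optional and simply keeps the frequent health check pings out of access.log.

```nginx
location /elb-status {
    access_log off;            # optional: don't log every health check ping
    default_type text/plain;   # Content-Type used for the body sent by "return"
    return 200 'Alive!';
}
```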

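When adding location blocks, we must be careful about how they interact. If several location blocks match a request, Nginx picks exactly one of them. An exact match (=) wins first. Otherwise, Nginx remembers the longest matching prefix, then tries the regular expression locations (~ and ~*) in the order they appear in the file and uses the first one that matches. If no regular expression matches, the longest matching prefix is used. So order matters among regular expression locations, and a regular expression location can take over a request that a prefix location such as / would otherwise handle.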
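
To make the matching rules concrete, here is a small annotated sketch. It reuses the blocks from this tutorial; the /static/ location is hypothetical and only there to show the ^~ modifier.

```nginx
upstream app1 {
    server localhost:4567;
}

server {
    listen 14344;
    root /srv/my_app/public;

    # 1. Exact match: checked first and wins immediately.
    location = /elb-status {
        return 200 'Alive!';
    }

    # 2. Prefix match with ^~: if this is the longest matching prefix,
    #    the regular expression locations below are skipped.
    #    (/static/ is a hypothetical path.)
    location ^~ /static/ {
        expires 30d;
    }

    # 3. Regular expressions: tried in file order, first match wins.
    location ~* \.(jpg|jpeg|png|gif)$ {
        expires 30d;
    }

    # 4. Plain prefix: used only when no regular expression matched.
    location / {
        proxy_pass http://app1;
    }
}
```

With this sketch, /elb-status is answered by the first block, /static/logo.png by the second, /traffic.jpeg by the third, and a request such as /about falls through to the proxy in the last block.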

For reference, here is the final configuration with the location blocks that we added in this tutorial:

upstream app1 {
 server localhost:4567;
 server localhost:4568;
}

server {
 listen 14344 default_server;
 listen [::]:14344 default_server ipv6only=on;

 root /srv/my_app/public;
 # root /usr/share/nginx/html;
 index index.html index.htm;
 # Make site accessible from http://localhost/
 server_name 54.201.203.28;

 location /elb-status {
 return 200 'Alive!';
 add_header Content-Type text/plain;
 }

 location ~* \.(jpg|jpeg|png|gif)$ {
 expires 30d;
 }


 location / {
 # First attempt to serve request as file, then
 # as directory, then fall back to displaying a 404.
 # try_files $uri $uri/ =404;
 # Uncomment to enable naxsi on this location
 # include /etc/nginx/naxsi.rules
 proxy_pass http://app1;
 }

 # Only for nginx-naxsi used with nginx-naxsi-ui : process denied requests
 #location /RequestDenied {
 # proxy_pass http://127.0.0.1:8080;
 #}

 error_page 404 /my_404.html;
 error_page 500 /50x.html;

...

So there, we saw what Nginx is and the motivations behind its birth, as well as some of the configurations we can change to better suit our needs. Thank you so much for reading! 🙂

P.S. Recently, I’ve been very privileged to get to know a lot of technologies and write posts about them (hehe, you may have noticed the influx of technical posts the past weeks). Haha, those wouldn’t be possible without the guidance and mentorship of the super awesome Joshua Lat! Super thank you, Josh, for all the guidance and support! 🙂
Thanks again, dear reader and ’til next time!

Sources:

  1. Nginx HTTP Server, 2nd edition by Clement Nedelcu
  2. Digital Ocean: Custom Nginx Error Pages on Ubuntu 
  3. Digital Ocean: Nginx Configuration Optimization
  4. Digital Ocean: Reverse Proxy vs Load Balancer
  5. Digital Ocean: Reverse Proxy Server

 
