Dr. Mark Humphrys

School of Computing. Dublin City University.



The Web



Performance (server-side)

Many things can be done on the server-side to speed up Web performance:
  1. Multi-threaded. Start servicing a new client while still responding to the previous client.

  2. Cache of (maybe huge numbers of) files in memory.
    Disk reads are slow, so don't make a separate disk access for every file request. Instead, maintain a cache in RAM of frequently accessed files and/or small files that are easy to hold in RAM.
    e.g. Search my genealogy site. It searches the text of the web pages. Over 1,700 web pages.
    But the search is instant. I think it is caching every single web page in RAM.
    Web pages are text files and so are small compared to images and video. 1,700 pages is only about 20 MB in total. Could easily hold all that in RAM.
    The entire site is about 30 GB. So the HTML text is less than 1/1000 of the site. This is normal enough.

  3. Multiple disks. Site could be spread over multiple disks to allow many reads going on at once.

  4. Defragmentation of disks. Reduce seek times.

  5. Multiple servers. "Server farm".

  6. Content delivery network - distributed distribution of resources.
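The in-memory cache in item 2 can be sketched roughly as follows. This is a minimal illustration, not how any particular web server does it. The class name `FileCache`, the `max_file_bytes` cut-off and the `demo` helper are all made up for the example, and there is no handling of files that change on disk.

```python
import os
import tempfile


class FileCache:
    """Keep the contents of small files in RAM so repeat requests skip the disk."""

    def __init__(self, max_file_bytes=1_000_000):
        self.max_file_bytes = max_file_bytes  # hypothetical cut-off for "small"
        self.store = {}      # path -> file contents (bytes)
        self.disk_reads = 0  # how many times we actually went to disk

    def get(self, path):
        if path in self.store:
            return self.store[path]      # served from RAM, no disk access
        with open(path, "rb") as f:
            data = f.read()
        self.disk_reads += 1
        if len(data) <= self.max_file_bytes:
            self.store[path] = data      # only small files are cached
        return data


def demo():
    """Request the same page three times and count the disk reads."""
    with tempfile.TemporaryDirectory() as d:
        page = os.path.join(d, "index.html")
        with open(page, "wb") as f:
            f.write(b"<html>hello</html>")
        cache = FileCache()
        for _ in range(3):
            body = cache.get(page)
        return body, cache.disk_reads
```

Three requests for the same page cost one disk read. A real cache would also need to notice when a file changes, and to evict entries when RAM runs short.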


Related to how the site is designed:

  1. Minification - Do various transforms to JS and other files to reduce size (reduce download time) and make parsing faster.
    Text files tend to be tiny anyway.

  2. Bundling of Files. One network request for a bundled JS file for the page, instead of 20 network requests for 20 JS files.
    Same for CSS - bundle into one CSS file.
    Reducing network calls can make a big difference.
    e.g. At time of writing I have 4 JS files for each page on Ancient Brain that I bundle into one JS file page.js.
    And I have 12 CSS files for each page that I bundle into one CSS file main.css.

  3. Small / low-resolution images (for any images used inline).
    Can click to expand.
    Definition of "small" changes over time.
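Bundling (item 2) can be as simple as concatenating the files at build time. A minimal sketch, assuming the sources are already in memory as text. The function `bundle` and the file names `a.js` and `b.js` are made up for the example. Real bundlers do much more (dependency ordering, minification).

```python
def bundle(sources):
    """Join several (name, text) pairs into one file body.

    Each part is preceded by a /* ... */ comment naming the original file.
    That comment syntax is valid in both JS and CSS.
    """
    parts = []
    for name, text in sources:
        parts.append(f"/* {name} */\n{text.rstrip()}\n")
    return "\n".join(parts)


# Hypothetical input files, in the order they must load.
SOURCES = [
    ("a.js", "var a = 1;\n"),
    ("b.js", "var b = a + 1;\n"),
]

PAGE_JS = bundle(SOURCES)  # one file to serve instead of two
```

The page then needs one network request for the bundle instead of one request per source file.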


  


For high-demand sites: Multiple copies of entire site - "server farm" - front end routes requests to different CPUs.

Problem: It is OK to have all the (small) requests come in through one front end and get routed to the searching nodes.
It is not OK to have all the (large) replies go back through one front end. That is a bottleneck.
Solution: TCP handoff - a trick that lets the searching node reply directly to the client, in a manner that is invisible to the client.
The reply load is therefore distributed over all the nodes.
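The effect on the front end can be seen with some simple accounting. This is only a byte-counting sketch, not an implementation of TCP handoff. The function name, the round-robin routing and the request and reply sizes are all assumptions made up for the example.

```python
import itertools


def front_end_load(requests, nodes, direct_reply):
    """Return (bytes through the front end, bytes each node sends straight to clients)."""
    sent_direct = {n: 0 for n in nodes}
    through_front_end = 0
    chooser = itertools.cycle(nodes)  # round-robin routing (an assumption)
    for request_size, reply_size in requests:
        node = next(chooser)
        through_front_end += request_size      # requests always enter via the front end
        if direct_reply:
            sent_direct[node] += reply_size    # node answers the client itself
        else:
            through_front_end += reply_size    # reply is relayed back through the front end
    return through_front_end, sent_direct


# Four hypothetical requests of 500 bytes, each with a 100,000 byte reply.
REQUESTS = [(500, 100_000)] * 4
NODES = ["node1", "node2"]

relayed = front_end_load(REQUESTS, NODES, direct_reply=False)
handed_off = front_end_load(REQUESTS, NODES, direct_reply=True)
```

With relaying, the front end carries 402,000 bytes. With direct replies it carries only the 2,000 bytes of requests, and each node sends 200,000 bytes itself.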




Caching in HTTP




Server logs

HTTP servers can log all accesses. Can have separate log for errors.



Typical web server logs.
(Apart from being colour-coded. Normal logs are not colour-coded.)
From askapache.com.
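A log line can be picked apart with a regular expression. A minimal sketch, assuming the server writes the widely used Common Log Format (host, identity, user, time, request, status, size). The sample line is invented for the example and is not taken from any real log.

```python
import re

# One Common Log Format entry:
# host ident user [time] "request" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    entry["status"] = int(entry["status"])
    # A size of "-" means no body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Invented sample line (192.0.2.1 is a documentation-only address).
SAMPLE = '192.0.2.1 - - [01/Jan/2024:12:00:00 +0000] "GET /index.html HTTP/1.1" 200 5120'

entry = parse_line(SAMPLE)
```

Once the lines are parsed, it is easy to count hits per page, or to pull out the error statuses.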





URI schemes

URI schemes show how the Web has tried to provide a unifying interface to all Internet protocols, data and activities.
  

Some URL formats.


  
Groups of URI schemes: those listed above (in use), obsolete, others (media), others (phone), and others.
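The scheme is the part of the URI before the first colon. A short sketch using Python's standard `urllib.parse` to split some URIs. The example URIs are made up, and the `describe` helper is just for illustration.

```python
from urllib.parse import urlsplit

# Made-up example URIs, one per scheme.
EXAMPLES = [
    "https://example.com/index.html",
    "ftp://ftp.example.com/pub/file.txt",
    "mailto:someone@example.com",
    "file:///home/user/notes.txt",
]


def describe(uri):
    """Return (scheme, network location, path) for a URI."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path


for uri in EXAMPLES:
    print(describe(uri))
```

Note that `mailto:` and `file:` URIs have no network location. The same syntax covers very different kinds of resource.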




HTTP client

Web browser

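Uses MIME types to decide how to handle each kind of content. Content the browser cannot display itself goes to:
(a) Plug-in - Runs inside the browser process.
(b) Helper application - Runs as a separate process.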
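The dispatch on MIME type can be sketched as follows. This is a simplification: a real browser mainly goes by the `Content-Type` header sent by the server, whereas this sketch guesses the type from the file name using Python's standard `mimetypes` module. The set `HANDLED_IN_BROWSER` and the function `choose_handler` are made up for the example.

```python
import mimetypes

# Hypothetical list of types this imaginary browser can display itself.
HANDLED_IN_BROWSER = {"text/html", "text/plain", "image/png", "image/jpeg"}


def choose_handler(filename):
    """Return (MIME type, what to do with the content)."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        return None, "save to disk"            # unknown type
    if mime_type in HANDLED_IN_BROWSER:
        return mime_type, "render in browser"
    return mime_type, "hand to plug-in or helper application"


print(choose_handler("page.html"))
print(choose_handler("report.pdf"))
```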





Keeping state

HTTP is stateless. Keeping state means relating one client-server request to the other requests from the same client.

Identify user (pay-to-view, register, personalisation).
Shopping carts.
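One common way to do this is a cookie holding a session id, which the client sends back with every request. A minimal sketch, using Python's standard `http.cookies` module. The cookie name `session_id`, the in-memory `SESSIONS` table and the function `handle_request` are assumptions for the example. A real site would also set expiry and security attributes on the cookie.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # session id -> per-user state (in memory only, hypothetical)


def handle_request(cookie_header, item=None):
    """Handle one stateless request.

    cookie_header is the value of the Cookie request header (or None).
    Returns (value for a Set-Cookie header or None, current cart contents).
    """
    cookie = SimpleCookie(cookie_header or "")
    morsel = cookie.get("session_id")
    session_id = morsel.value if morsel else None

    set_cookie = None
    if session_id not in SESSIONS:
        # First visit (or unknown id): start a new session.
        session_id = secrets.token_hex(16)
        SESSIONS[session_id] = {"cart": []}
        set_cookie = f"session_id={session_id}"

    cart = SESSIONS[session_id]["cart"]
    if item is not None:
        cart.append(item)
    return set_cookie, list(cart)


# First request has no cookie. The second sends back the cookie it was given.
first_cookie, cart1 = handle_request(None, "book")
second_cookie, cart2 = handle_request(first_cookie, "pen")
```

The second request is recognised as the same user, so the cart holds both items and no new cookie is issued.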






Performance (client-side)

Many things can be done on the client-side to speed up Web performance.

Actually, all of these things, though taking place on the client, involve server support too:

  1. Client-side caching
    • Browser maintains cache (in memory or disk or both).
    • How to see your cache files in various browsers.
    • Server tells you what to cache.

  2. Site-wide (or ISP-wide) cache via proxy server.

  3. Lazy load - of images etc.

  4. Infinite scroll - Load more of page on scroll to bottom.
    Use with moderation. See article about why this is only suitable for some types of sites.

  5. Delayed loading of resources.
    Delayed running of scripts.
    Fetch some resources / run some JS only after initial page is rendered.
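"Server tells you what to cache" (item 1) is done with HTTP response headers. A minimal sketch of the server side, using the standard `Cache-Control` and `ETag` response headers and the `If-None-Match` request header. The function `respond`, the way the ETag is computed and the default `max_age` are assumptions for the example.

```python
import hashlib


def respond(body, if_none_match=None, max_age=3600):
    """Return (status, headers, body) for a request.

    if_none_match is the ETag the client already has cached, if any.
    """
    # Hypothetical ETag: a short hash of the content.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {
        "ETag": etag,
        "Cache-Control": f"max-age={max_age}",  # client may reuse for this many seconds
    }
    if if_none_match == etag:
        return 304, headers, b""   # Not Modified: client's cached copy is still good
    return 200, headers, body


# First fetch gets the full body. The revalidation sends the ETag back.
status1, headers1, body1 = respond(b"<html>hello</html>")
status2, headers2, body2 = respond(b"<html>hello</html>", if_none_match=headers1["ETag"])
```

The second response is a 304 with an empty body, so the page is not downloaded again.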



DCU proxy servers

DCU is (apparently) not using proxy servers any more. But they are still in use outside DCU.
  
In DCU, some machines may communicate with the outside world through a proxy server.
Some communicate directly (not through a proxy).


  1. wwwproxy.computing.dcu.ie = 136.206.11.243 (forwards requests through 136.206.11.249)
    • port: 8000

  2. proxy.dcu.ie alternates between different IP addresses (for load balancing)
    • port: 8080 or 3128
    • lookup shows it alternates randomly between:
      1. 136.206.1.17
      2. 136.206.1.20


To set proxy, something like:
  1. Firefox - Tools - Options - Advanced - Network - Settings
  2. IE - Tools - Options - Connections - LAN settings
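A program can be pointed at a proxy in much the same way as a browser. A minimal sketch using Python's standard `urllib.request`. The host and port defaults are the `proxy.dcu.ie` values listed above, which may no longer be live, so the sketch only builds the opener and does not make a request. The function `make_proxy_opener` is made up for the example.

```python
import urllib.request


def make_proxy_opener(host="proxy.dcu.ie", port=8080):
    """Build a URL opener that sends HTTP and HTTPS requests via a proxy."""
    proxy_url = f"http://{host}:{port}"
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    return opener, proxy_url


opener, proxy_url = make_proxy_opener()
# opener.open("http://example.com/") would now go through the proxy,
# if that proxy is reachable from this machine.
```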

You may use a proxy auto-config (PAC) file:

  1. https://computing.dcu.ie/proxy.pac
  2. http://proxy.dcu.ie/proxy.pac


Test the IP address other sites see:




On the Internet since 1987.

Wikipedia: Sometimes I link to Wikipedia. I have written something In defence of Wikipedia. It is often a useful starting point but you cannot trust it. Linking to it is like linking to a Google search. A starting point, not a destination. I automatically highlight in red all links to Wikipedia and Google search and other possibly-unreliable user-generated content.