Dr. Mark Humphrys

School of Computing. Dublin City University.

Remote and Network Computing



telnet and ftp (and their successors)

These two commands (and their secure successors) have been the two fundamental commands of the Internet / remote computing for decades.

  1. telnet (host) - Log in to remote host
  2. ftp (host) - Transfer files to/from remote host

With telnet you get a command-line, with ftp you get a read-write file system.

Origin:

  1. telnet, 1971.
  2. ftp, 1971.

Read-only ftp (File Transfer Protocol) was what people used to publish files and archives online before the Web (http - Hypertext Transfer Protocol). It is still sometimes used for this, although the major browsers have since dropped built-in ftp support, so a separate ftp client may be needed to read such files.

ftp is now more often used in read-write mode for uploading web sites. e.g. Your web hosting company uses a UNIX server. You periodically upload your web site (edited in Windows) onto it with ftp. There are many graphical drag-and-drop ftp clients, and even programs that make the site into a full Windows drive.

These two core commands have been replaced by secure versions:

  1. telnet -> ssh
  2. ftp -> sftp / scp (both use ssh) or ftps (uses ssl)
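
As a rough sketch of what the secure replacements look like in a script, the functions below wrap ssh, sftp and scp. The account name `jbloggs`, the host `remote.example.com` and the function names are hypothetical placeholders, not anything defined in these notes.

```shell
# Hypothetical account and host; substitute your own.
REMOTE="jbloggs@remote.example.com"

# telnet replacement: interactive login to the remote host.
remote_login() {
    ssh "$REMOTE"
}

# Run a single command on the remote host and return.
remote_run() {
    ssh "$REMOTE" "$@"
}

# ftp replacement: interactive file transfer session over ssh.
remote_ftp() {
    sftp "$REMOTE"
}

# Non-interactive copies over ssh.
# upload:   $1 = local file,  $2 = remote path
# download: $1 = remote file, $2 = local path
upload() {
    scp "$1" "$REMOTE:$2"
}
download() {
    scp "$REMOTE:$1" "$2"
}

# Usage:
#   remote_run ls -l
#   upload index.html public_html/
#   download notes.txt .
```

Because scp takes all its arguments on the command line, it tends to be easier to call from scripts than an interactive sftp session.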



DCU remote access for students

You may or may not be able to remotely ssh or sftp to:

student.computing.dcu.ie = 136.206.11.245 (CA, on servers subnet)


How to login to Linux at DCU




Remote email

Use the POP3 or IMAP protocol to talk to the mail server:
  mailhost.computing.dcu.ie
  mail.dcu.ie


Internet access in the past

When I was an undergraduate in the 1980s:
  1. Universities support their own dial-in access, since you can't buy such a service anywhere.

  2. Nobody in the phone company, TV, advertising, business, marketing, media, the press, the government or society at large has heard of the Internet. In the Computer Science department of the university, the Internet is for researchers and postgraduates, if they use it at all. It is not even mentioned when undergraduates are lectured about Computer Networks. (A little-told story is how even many computer-networks researchers ignored the rise of the Internet.) If undergraduates find out about it themselves and are interested, they may be given access by special permission.

  3. After you leave college, you try to piggyback onto old college accounts, friends and contacts still at college, and so on, because you can't actually buy Internet access anywhere. You have to fight hard to get onto this underground thing that no one has heard of.



Accessing UNIX remotely and from Windows

Running UNIX GUI applications:


I use the following two to run a Windows GUI with a UNIX command-line underneath. My files on the UNIX server appear as just another read-write Windows drive. I can use Windows apps to edit them. And I have a UNIX command-line always open on which I can run scripts to process them:



FTP scripting

Of course, for repetitive tasks, drag-and-drop is not a better interface than being able to write automated scripts (this will be a theme of this course). You can write ftp scripts ("macros") and call them from Shell scripts:
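
A minimal sketch of one common way to do this: generate the ftp commands in the shell and feed them to the ftp client on standard input. The server name, account, password and function names here are hypothetical. Plain ftp sends the password unencrypted, so treat this as an illustration of the scripting idea only.

```shell
# Hypothetical server and account; substitute your own.
FTP_HOST="ftp.example.com"
FTP_USER="jbloggs"
FTP_PASS="secret"

# Print the ftp commands (the "macro") that upload one file.
# $1 = local file, $2 = remote directory
ftp_upload_commands() {
    printf 'user %s %s\n' "$FTP_USER" "$FTP_PASS"
    printf 'binary\n'
    printf 'cd %s\n' "$2"
    printf 'put %s\n' "$1"
    printf 'bye\n'
}

# Feed those commands to the ftp client on standard input.
# -n: do not attempt auto-login, so the "user" line above logs us in
# -i: turn off interactive prompting during multiple-file transfers
ftp_upload() {
    ftp_upload_commands "$1" "$2" | ftp -n -i "$FTP_HOST"
}

# Usage, e.g. from a nightly cron job:
#   ftp_upload index.html public_html
```

Keeping the command generation in its own function means the commands can be printed and checked before anything is sent to the server.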


HTTP scripting

You can also do HTTP GET or POST scripting from the command-line.
Some tools that do this:


  1. lynx
    • does HTTP GET:
        lynx -reload -source URL
      
    • does HTTP POST:
        cat DATA | lynx -reload -source -post_data URL
      

  2. wget
    • gnu.org
    • manual
    • does HTTP GET:
        wget -q -O - URL
      
    • Sites that block scripts:
      If a site won't let a script see its content, you can set the User-Agent header to pretend to be a browser:
        UserAgent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
        wget -q -O - -U "$UserAgent" URL
      
    • To do the "Link checker" Java practical in shell script, something like:
        wget --spider --force-html -i file.html
      

  3. cURL
    • does HTTP GET
    • does HTTP POST
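
For cURL, the GET and POST cases might look something like the sketch below. The function names and example URLs are hypothetical; the User-Agent string is the one from the wget example.

```shell
# User agent string taken from the wget example.
UserAgent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

# HTTP GET: print the body of the URL on standard output.
# -s: no progress meter   -S: still show errors   -L: follow redirects
# -A: set the User-Agent header
http_get() {
    curl -s -S -L -A "$UserAgent" "$1"
}

# HTTP POST: send standard input as the request body.
# --data-binary @- reads the body from stdin and sends it unmodified
http_post() {
    curl -s -S -A "$UserAgent" --data-binary @- "$1"
}

# Usage (hypothetical URLs):
#   http_get "http://example.com/page.html" > page.html
#   cat DATA | http_post "http://example.com/form.cgi"
```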



Parsing XML / HTML

There is support in many programming languages for parsing XML / HTML.
The problem is that they may fail on badly-formed XML / HTML, which describes a lot of the HTML found on real web sites.
  1. JavaScript

  2. Java - Parsing HTML in Swing

  3. Command-line tools that you can use in shell scripts
    • xpath - Parse well-formed XML
    • example XML file
      # extract nodes delimited by <choices>
        cat test.ajax.xml | xpath //choices

      # extract nodes delimited by <item> within those
        cat test.ajax.xml | xpath //choices//item

      # get first node only
      # (note: //item[1] matches the first <item> under each parent, so it can return more than one node)
        cat test.ajax.xml | xpath "(//choices//item)[1]"
        cat test.ajax.xml | xpath "//item[1]"

      # get text inside tags
        cat test.ajax.xml | xpath "//item[1]" | xpath "//label[1]"
        cat test.ajax.xml | xpath "//item[1]" | xpath "//label[1]/text()" > outputfile
      
      

  4. Error-tolerant command-line tools:

Strategy for parsing HTML:
  1. Use error-tolerant readers like TagSoup to convert badly-formatted HTML to well-formatted XHTML.
  2. Can now parse XHTML with other, more picky programs like xpath.
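
The two-step strategy can be sketched in shell as follows. This sketch swaps in xmllint (from libxml2) for both steps, rather than TagSoup and xpath, on the assumption that xmllint is installed. The function names are hypothetical.

```shell
# Step 1: error-tolerant read.
# --html uses libxml2's forgiving HTML parser;
# --xmlout writes the repaired tree out as well-formed XML.
# Parser warnings about the bad HTML are discarded.
html_to_xml() {
    xmllint --html --xmlout "$1" 2>/dev/null
}

# Step 2: run an XPath query over the repaired document.
# $1 = XPath expression, $2 = HTML file
html_xpath() {
    html_to_xml "$2" | xmllint --xpath "$1" -
}

# Usage:
#   html_xpath 'string(//title)' page.html
#   html_xpath '//a/@href' page.html
```

Keeping the repair step separate means its output can also be piped into other, pickier XML tools.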


Working remotely

Idea: Your files are "on the network" somewhere. You can access them and make changes to them from anywhere. All copies stay in synch.

This is what you actually have within DCU (can move from terminal to terminal, accessing files at central server). The idea is that you would have this at home (and when travelling etc.) as well.


Simplest solution - 1 copy of files
  1. Read files from server, and copy changed files back to server as you go along. You can do this with ftp now, but you really need broadband to work with remote files, and high-speed broadband to work with large ones.

More complex solution - 2 copies of files
Work on a machine which has a synchronised mirror of the server files. You have to keep the copies in synch.

  1. Synchronise over the network.
    Read files from server at start of session, copy changed files back at end of session.
    e.g. Say you have an always-on broadband connection:
    1. When you leave the office, set a synchronise program running with home. By the time you get home, the files are synchronised. Work on them locally.
    2. When going back to the office, start the synch program again. By the time you get in, the files are synchronised again.

    Or:

  2. Physically bring laptop (or flashdrive / external hard disk) to/from work to synchronise.
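
Synchronising over the network could be scripted along the following lines. This sketch assumes rsync over ssh as the synchronise program, which is only one possible choice. The account, host, directories and function names are hypothetical.

```shell
# Hypothetical account, host and directories; substitute your own.
REMOTE="jbloggs@remote.example.com"
REMOTE_DIR="work/"
LOCAL_DIR="$HOME/work/"

# Start of session: bring the local mirror up to date with the server.
# -a: recurse and preserve times and permissions   -z: compress in transit
# --update: skip files that are newer on the receiving side
# -e ssh: use ssh as the transport
pull_from_server() {
    rsync -az --update -e ssh "$REMOTE:$REMOTE_DIR" "$LOCAL_DIR"
}

# End of session: copy changed files back to the server.
push_to_server() {
    rsync -az --update -e ssh "$LOCAL_DIR" "$REMOTE:$REMOTE_DIR"
}

# Usage:
#   pull_from_server    # before starting work
#   push_to_server      # when finished
```

This copies in one direction at a time. It does not propagate deletions or resolve the case where the same file was changed in both places.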




