Dr. Mark Humphrys

School of Computing. Dublin City University.



Lab - stock prices


getprice (stock symbol)
Get the current price of the given stock.
Usage: getprice GOOG
  1. Download the quote page, then parse it to extract the price.
  2. See Parsing XML / HTML

  3. Hard to parse: http://bigcharts.marketwatch.com/quickchart/quickchart.asp?symb=SYMBOL
    • grep "Last:" | head -1 | various sed's

  4. Easier to parse: http://finance.yahoo.com/q?s=SYMBOL because the stock price is delimited by tags.
    • grep "Last Trade:"
    • grep "yfs_l10_SYMBOL"
    • Something like:
      <span id="yfs_l10_goog">540.30</span>

    • If you clean up the HTML first to make it well-formed XHTML, you can use xpath to parse it properly:
       
      # search for <span> tag(s) with attribute id="yfs_l10_goog"
      cat cleanedupfile.xhtml | xpath '//span[@id="yfs_l10_goog"]'

      # get first one
      cat cleanedupfile.xhtml | xpath '(//span[@id="yfs_l10_goog"])[1]'

      # get contents
      cat cleanedupfile.xhtml | xpath '(//span[@id="yfs_l10_goog"])[1]/text()' > outputfile
      

  5. In general, a remote HTML page may not be written in a way that lets a machine easily find the stock price.
  6. Downside of the script: it has to be rewritten if the HTML format changes.
  7. Web scraping issues.
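
The grep / sed approach in step 4 could be sketched roughly as below. This is only a sketch, not a model answer. Assumptions: the function names extract_price and getprice are made up for illustration, curl is used as the downloader, the price sits in a tag like <span id="yfs_l10_goog">540.30</span> with the symbol in lower case, and the URL is the one given above (the live page may no longer look like this). The check at the end is there because of point 6: if the HTML format changes, the script should fail loudly rather than print junk.

```shell
#!/bin/sh
# Sketch of getprice. Function names and the use of curl are assumptions.

# extract_price SYMBOL: read quote-page HTML on stdin, print the price.
# Assumes a tag of the form <span id="yfs_l10_goog">540.30</span>.
extract_price() {
    # Assumption: the id in the page uses the lower-case symbol.
    sym=$(printf '%s' "$1" | tr 'A-Z' 'a-z')

    # Keep the first matching line, then strip everything around the price.
    price=$(grep "yfs_l10_$sym" | head -n 1 |
        sed -n "s/.*id=\"yfs_l10_$sym\"[^>]*>\([^<]*\)<.*/\1/p")

    # Fail loudly if nothing was found or the result does not look numeric
    # (for example because the HTML format has changed).
    case "$price" in
        ''|*[!0-9.,]*)
            echo "getprice: could not find price for $1" >&2
            return 1
            ;;
    esac
    printf '%s\n' "$price"
}

# getprice SYMBOL: download the quote page and extract the price.
getprice() {
    curl -s "http://finance.yahoo.com/q?s=$1" | extract_price "$1"
}
```

Keeping the parsing in its own function means it can be tried on a saved copy of the page without touching the network, e.g. extract_price GOOG < savedpage.html (savedpage.html being whatever file you saved the page to).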


On Internet since 1987.