There are lots of YouTube downloaders, but it is a good exercise to write one yourself from first principles. You may be surprised that one can be written in Shell.
This script needs re-writing every year, because YouTube keep changing their formats.
<head> ...
<script> ...
some.function("IRRELEVANT_URL");
some.function("THE_URL");
</script>
<title> ...
This Shell script gets the "home page" for the video, pipes it into a separate script to extract the URL, and then downloads that URL.
url=`wget -q -O - "$1" | extracturl`
wget -O - "$url" > y.flv
To extract the URL, we make a Shell script "extracturl" consisting of various programs piped together.
("extracturl" could be a script or a Shell function.)
TIP: See what the first pipe does, then the first two, then the first three, until you have all of them piped together.
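As a rough illustration of that tip, the sketch below builds a pipeline one stage at a time against a saved copy of a page, so nothing has to be fetched again between attempts. The sample page, the temporary file and the choice of stages are all assumptions made for this example (modelled on the page outline above), not the actual five steps.

```shell
# Hypothetical stand-in for a saved video page (same shape as the outline above).
page=$(mktemp)
printf '%s\n' \
    '<head>' \
    '<script>some.function("IRRELEVANT_URL");some.function("THE_URL");</script>' \
    '<title>example</title>' > "$page"

# First stage alone: keep only the line that holds the script.
grep '<script>' "$page"

# First two stages: put every double-quoted string on its own line.
grep '<script>' "$page" | tr '"' '\n'

# First three stages: keep only the string we are after.
result=$(grep '<script>' "$page" | tr '"' '\n' | grep 'THE_URL')
echo "$result"

rm -f "$page"
```

Looking at the output after each added stage shows at once which stage is at fault when the final result is empty.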
The following are to be piped together:
This gives a single line, looking something like this (URL in bold):
<script>var yt = yt || {};yt.preload = {};yt.preload.counter_ = 0;yt.preload.start = function(src) {var img = new Image();var counter = ++yt.preload.counter_;yt.preload[counter] = img;img.onload = img.onerror = function () {delete yt.preload[counter];};img.src = src;img = null;};yt.preload.start("http:\/\/r13---sn-q0c7dn7z.c.youtube.com\/crossdomain.xml");yt.preload.start("http:\/\/r13---sn-q0c7dn7z.c.youtube.com\/generate_204?sver=3\u0026source=youtube\u0026expire=1361552880\u0026id=3ae49d53cb5b7076\u0026signature=C11CE1F97394B7D012FDC4A5279B1DF38E5858BA.79A6AC7DB3888987E1B9D7B576D66B23D7A6D25B\u0026mv=m\u0026sparams=algorithm%2Cburst%2Ccp%2Cfactor%2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire\u0026burst=40\u0026ms=au\u0026cp=U0hVRlVQUF9NSkNONV9NSlRJOk5zempyMnY3al9B\u0026ipbits=8\u0026upn=GLEXJMqd2sQ\u0026factor=1.25\u0026algorithm=throttle-factor\u0026ip=MYIP\u0026itag=34\u0026mt=1361529971\u0026key=yt1\u0026fexp=909708%2C914072%2C916611%2C920704%2C912806%2C902000%2C922403%2C922405%2C929901%2C913605%2C925006%2C931202%2C908529%2C920201%2C930101%2C906834%2C926403%2C901451");</script><title>titanic in 5 seconds - YouTube</title><link rel="search" type="application/opensearchdescription+xml" href="http://www.youtube.com/opensearch?locale=en_US" title="YouTube Video Search"><link rel="shortcut icon" href="http://s.ytimg.com/yts/img/favicon-vfldLzJxy.ico" type="image/x-icon"> <link rel="icon" href="//s.ytimg.com/yts/img/favicon_32-vflWoMFGx.png" sizes="32x32"><link rel="canonical" href="/watch?v=OuSdU8tbcHY"><link rel="alternate" media="handheld" href="http://m.youtube.com/watch?v=OuSdU8tbcHY"><link rel="alternate" media="only screen and (max-width: 640px)" href="http://m.youtube.com/watch?v=OuSdU8tbcHY"><link rel="shortlink" href="http://youtu.be/OuSdU8tbcHY"> <meta name="title" content="titanic in 5 seconds">
[2 out of 5]
tr '"' '\n' | grep "generate_204"
You now have the extracted URL:
http:\/\/r13---sn-q0c7dn7z.c.youtube.com\/generate_204?sver=3\u0026source=youtube\u0026expire=1361552880\u0026id=3ae49d53cb5b7076\u0026signature=C11CE1F97394B7D012FDC4A5279B1DF38E5858BA.79A6AC7DB3888987E1B9D7B576D66B23D7A6D25B\u0026mv=m\u0026sparams=algorithm%2Cburst%2Ccp%2Cfactor%2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire\u0026burst=40\u0026ms=au\u0026cp=U0hVRlVQUF9NSkNONV9NSlRJOk5zempyMnY3al9B\u0026ipbits=8\u0026upn=GLEXJMqd2sQ\u0026factor=1.25\u0026algorithm=throttle-factor\u0026ip=MYIP\u0026itag=34\u0026mt=1361529971\u0026key=yt1\u0026fexp=909708%2C914072%2C916611%2C920704%2C912806%2C902000%2C922403%2C922405%2C929901%2C913605%2C925006%2C931202%2C908529%2C920201%2C930101%2C906834%2C926403%2C901451
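The tr and grep step can be tried on its own with a much shorter input. In this sketch the sample string and the host name example.invalid are made up for illustration; only the tr and grep commands come from the step above.

```shell
# Hypothetical, shortened version of the <script> line (host name is made up).
sample='yt.preload.start("http:\/\/example.invalid\/crossdomain.xml");yt.preload.start("http:\/\/example.invalid\/generate_204?sver=3\u0026itag=34");'

# tr turns every double quote into a newline, so each quoted string
# ends up on a line of its own; grep then keeps the one we want.
extract_quoted_url() {
    tr '"' '\n' | grep "generate_204"
}

extracted=$(printf '%s\n' "$sample" | extract_quoted_url)
printf '%s\n' "$extracted"
```

The crossdomain.xml string is dropped because it does not contain "generate_204". The result still carries the \/ and \u0026 escapes at this point.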
[3 out of 5]
Warning: "\" in the string has special meaning to sed.
Warning: "\" and "&" in the string have special meaning to sed.
sed 's|generate_204|videoplayback|g'
This gives the edited URL, looking like this:
http://r13---sn-q0c7dn7z.c.youtube.com/videoplayback?cp=U0hVRlVQUF9NSkNONV9NSlRJOjg0OHRUaFZKTTBJ&id=3ae49d53cb5b7076&signature=0FDF981AB22BCCB53F4B2059436D6837C58E1BD1.3F69089BEF997527BDE4B91DEFA0902E8DAFB37F&ip=MYIP&ms=au&source=youtube&expire=1361552880&key=yt1&factor=1.25&ipbits=8&mv=m&sver=3&mt=1361530031&upn=xfVDC2jSFOk&fexp=906357%2C916807%2C920704%2C912806%2C902000%2C922403%2C922405%2C929901%2C913605%2C925006%2C908529%2C920201%2C930101%2C906834%2C926403%2C901451&sparams=algorithm%2Cburst%2Ccp%2Cfactor%2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire&itag=34&algorithm=throttle-factor&burst=40
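The two warnings about sed matter as soon as the escapes are removed. The sketch below is one possible way to do it, assuming the clean-up amounts to turning \/ into / and \u0026 into & before the generate_204 substitution. The first two sed expressions, the function name unescape_url and the sample input are assumptions for illustration, not the original steps.

```shell
# Hypothetical clean-up stage; only the third expression is taken from the text above.
unescape_url() {
    # In the pattern, a literal backslash must be written as \\ .
    # In the replacement, a bare & means "the whole match", so a literal
    # ampersand must be written as \& .
    sed -e 's|\\/|/|g' \
        -e 's|\\u0026|\&|g' \
        -e 's|generate_204|videoplayback|g'
}

# Made-up input in the same shape as the extracted URL.
escaped='http:\/\/example.invalid\/generate_204?sver=3\u0026itag=34'
clean=$(printf '%s\n' "$escaped" | unescape_url)
printf '%s\n' "$clean"
```

Using | as the delimiter in the s command avoids having to escape the forward slashes as well.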
[4 out of 5]
[5 out of 5]

On Internet since 1987.