Yes, it is true that if we all migrated to XHTML,
it would make it easier for programs to process content.
But are you going to re-write 10 billion web pages?
Some of the XHTML vision is incredibly utopian:
"XML requires user-agents to fail when encountering malformed XML".
Q. Would you use such a browser?
i.e. one that wouldn't allow you to view a favourite site
because it had "malformed" XHTML.
Or would you (as the whole history of the Web shows)
simply move quietly to a different, more tolerant browser?
Anyone selling an unforgiving browser won't have a lot of users.
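To make the "fail on malformed XML" rule concrete, here is a minimal sketch using the strict XML parser in Python's standard library. The sample strings and the strict_parse helper are made up for illustration. A real XHTML user-agent is far more involved, but the all-or-nothing behaviour is the same idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents, for illustration only.
WELL_FORMED = "<p>Hello <b>world</b></p>"
MALFORMED = "<p>Hello <b>world</p>"  # the <b> element is never closed


def strict_parse(markup):
    """Return the root tag name, or None if the XML parser rejects the input."""
    try:
        return ET.fromstring(markup).tag
    except ET.ParseError as err:
        # A strict user-agent stops here and shows an error instead of the page.
        print("refused to render:", err)
        return None


print(strict_parse(WELL_FORMED))
print(strict_parse(MALFORMED))
```

One missing closing tag is enough for the whole document to be refused, which is exactly what a user would see as "this site doesn't work in this browser".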
"The recommendation for browsers to post an error
rather than attempt to render malformed content
should help eliminate malformed content."
- Yeah, right.
Because authors have nothing better to do.
As for the idea of making it easier to display on small devices:
Well, not only does my tablet display malformed HTML beautifully with no problem,
but so did my old PDA and even my old WAP phone!
Martian Headsets, Joel Spolsky, March 17, 2008, on HTML standards:
Maybe "the way the web 'should have' been built would be to have very, very strict standards and every web browser should be positively obnoxious about pointing them all out to you and web developers that couldn't figure out how to be 'conservative in what they emit' should not be allowed to author pages that appear anywhere until they get their act together.
But, of course, if that had happened, maybe the web would never have taken off like it did, and maybe instead, we'd all be using a gigantic Lotus Notes network operated by AT&T. Shudder."
About the idea that old web pages need to "change" to conform to standards:
"Those websites are out of your control. Some of them were developed by people who are now dead.
...
The idealists don't care: they want those pages changed.
Some of those pages can't be changed. They might be burned onto CD-ROMs. Some of them were created by people who are now dead. Most of them created by people who have no frigging idea what's going on and why their web page, which they paid a designer to create 4 years ago, is now not working properly."
Again, if the browser doesn't display the old pages, what will most people do?
That's right.
Dump the browser.
For new projects, you could use XHTML
I should say that for new projects, done by professionals, using XHTML may be a good idea.
It will make it easier for your team to re-purpose your content in the future.
I am only pointing out that this cannot be the entire world. One must also consider:
Old pages.
New pages written by people who do not conform to standards.
(You might say "amateurs".
Or you might say "people with other jobs".)
There are millions of such sites.
There are new such sites created every day.
Consider even just the web pages of all computer lecturers at DCU.
How many validate their HTML?
XHTML may not replace HTML, but may instead become a parallel subset of the web.
HTML5 seems opposed to the XHTML idea
Instead of everyone moving from HTML to XHTML,
there has been a demand to update HTML (which, strangely, has had no new official definition since 1999).
HTML5 will allow XHTML syntax of course, but will not enforce it.
It is tolerant like HTML 4.
It will not demand the XHTML features of
lower-case tag names, quoted attribute values, a value for every attribute,
and closing of all empty elements.
Rather than reject badly-formed HTML,
HTML5 tries to define
the required processing for such documents.
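As a rough illustration of tolerant versus strict processing, the sketch below feeds the same sloppy markup to Python's lenient html.parser and to its strict XML parser. Note the assumptions: html.parser is forgiving but does not implement the actual HTML5 parsing algorithm, and the SLOPPY and STRICT strings, TagCollector and xml_accepts are invented for this example. SLOPPY breaks all four XHTML rules listed above.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical markup: upper-case tags, unquoted attribute values,
# an attribute with no value, and empty elements that are never closed.
SLOPPY = "<P CLASS=intro><INPUT type=checkbox checked><BR>Some text"
# The same content written to XHTML rules.
STRICT = ('<p class="intro"><input type="checkbox" checked="checked" />'
          "<br />Some text</p>")


class TagCollector(HTMLParser):
    """Record every start tag the lenient parser manages to recover."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, dict(attrs)))


def xml_accepts(markup):
    """True if the strict XML parser accepts the markup as well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


collector = TagCollector()
collector.feed(SLOPPY)
collector.close()
for tag, attrs in collector.seen:
    print(tag, attrs)  # tag and attribute names come back lower-cased

print("XML parser accepts sloppy version:", xml_accepts(SLOPPY))
print("XML parser accepts strict version:", xml_accepts(STRICT))
```

The lenient parser recovers all three elements from the sloppy version, while the XML parser only accepts the version that follows every XHTML rule.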
This site
points out that a spec can declare tags obsolete,
but that doesn't mean browsers should reject that tag.
"Specs don't need to be backwards compatible.
Instead, the better solution is that user-agents should be backwards compatible, by supporting multiple specs."
That is, browsers can support HTML5 and HTML4 and HTML3 and before.
Human-readable web and machine-readable web may stay separate
In practice, instead of being mixed together, XML and HTML are often provided as entirely separate services.
e.g. A company provides:
5,000 HTML pages displaying products and prices.
1 single XML dump of all machine-readable price data for the whole website,
a few MB of text in size.
Remote bots grab this dump with one request
instead of making 5,000 separate requests for tiny XML fragments.
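A sketch of what the bot's side of this might look like, assuming a made-up dump format. The element names catalogue, product, name and price, and the load_prices helper, are hypothetical and not taken from any real site. The dump is inlined here as a string. A real bot would fetch it with a single HTTP request.

```python
import xml.etree.ElementTree as ET

# Hypothetical price dump; a real one would hold thousands of <product> entries.
DUMP = """<catalogue>
  <product id="1001"><name>Widget</name><price currency="EUR">9.99</price></product>
  <product id="1002"><name>Gadget</name><price currency="EUR">24.50</price></product>
</catalogue>"""


def load_prices(xml_text):
    """Map each product id to a (name, price) pair from one XML dump."""
    root = ET.fromstring(xml_text)
    prices = {}
    for product in root.iter("product"):
        name = product.findtext("name")
        price = float(product.findtext("price"))
        prices[product.get("id")] = (name, price)
    return prices


for product_id, (name, price) in load_prices(DUMP).items():
    print(product_id, name, price)
```

One parse of one document gives the bot every price, and the HTML pages for humans never need to be touched or scraped.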