2010-09-19
大家好!我是 Helen,是今年創用 CC 的暑期工讀生,目前就讀政治大學英國語文學系(輔修資訊科學),對電子文學相關議題有興趣,因此這次實習也做了一些電子書與電子檔案標示語言相關研究。
引言
2008年,微軟總裁 Steve Ballmer 在與華盛頓郵報記者訪談中提出:十年內所有實體印刷媒體都會消失,所有內容將電子化透過網路傳遞。
距離 Steve Ballmer 的預測只有兩年,出版業就面臨電子書新技術引進的挑戰。在2010年第二季,亞馬遜電子書銷售量首次超越實體出版,每有1本實體書就有1.43本電子書出售。蘋果、索尼、亞馬遜、微軟也紛紛嵌入電子閱讀技術到他們的各種行動通訊設備中。同樣值得注意的,今年7月28日台灣的第一個網路電子書城 Pubu 成立。
到底何謂電子書?電子書將如何影響平面媒體?會如何激發創造力、催化開放標準以及自由思想的普及?或是科技的進步會使電子書又再經歷一次革命?希望能從中探討一些問題,也為未知的可能性留一些想像空間。
IDPF 與 EPUB 介紹
談及電子圖書相關議題,第一個就必須先介紹 EPUB。EPUB 由國際數位出版聯盟(IDPF)在2007年發表的電子書出版官方標準格式,取代了原來的開放式電子書標準 Open eBook。現今大多數的閱讀器〈亞馬遜的 Kindle 是一個例外〉都支持 EPUB 的檔案格式。
目前 EPUB 最新版本是2010年七月發表的2.0.1版,而下一版,2.1版預計在2011年一月發表。
EPUB 的目標在於提供一個開放且自由的電子書標準,包含可搜尋的文字與圖片、在跨閱讀器及閱讀平台都可正常顯示、並提供數位著作權與版權資訊內嵌機制。
EPUB 標準共包含三份規格:開放出版結構 (Open Publication Structure;OPS)、開放包裹格式 (Open Packaging Format;OPF)、以及 OEBPS 容納格式 (OEBPS Container Format;OCF). 他們之間的關係可用一個圖表示
EPUB 標準最棒的地方,簡言之,就是它是由網頁組成並壓縮成 EPUB 檔:使用者想出版自己電子書時不需要對程式語言或是電腦軟硬體有極深的認識,出版門檻相對比較低。雖說數位版權以及著作權問題也將因此而生,但如此低的出版門檻更能激發創意以及使創作在網路上廣為傳播。
文字的排版與呈現
EPUB 的內容與呈現由 OPS 規格規範。內容的標示是使用 XHTML 的一個子集,主要是移除所有互動性的內容標示〈例如表單製作〉而內容樣式的呈現主要是由 CSS 樣式來做設定,目前是支持到 CSS 2.0。
其實 EPUB 對於內容與樣式的限制與原本自由且開放的原則有矛盾性存在:雖然個人化樣式是被允許的,但規範中同時要求跨平台呈現的一致性,也因為對於閱讀器規格沒有嚴謹的規範,使每個閱讀器呈現同一電子書檔的方式會約略不同,個人化樣式也相對比較困難。
互動性與多媒體的嵌入
電子書與實體書有甚麼不同?當然因為形式是一個電子檔,流通性較高、增加可攜帶度,同時也因為不需要紙,應該較能獲得環保團體親徠。不過真正在電子書與實體書之間畫上界線的是數位文學的興起;Katherine Hayes 曾定義電子文學為「並非由電子化的平面媒體轉換而來,而是『與生俱來』的數位內容,也通常以電腦為傳播與閱讀平台的文學。」Underland Press 一家外國出版社進而用 "Wovel.",即「網頁 (Web)」與「小說 (Novel)」兩字組合成的新詞描述這種新文學形態。
雖然 EPUB 目前不支持影音內嵌與互動性內容,但未來標準可望會讓閱讀器提供相當好的支援,這樣才與傳統書本電子化有所區分,創造一種新的互動式媒體、網路文學,讓電子書的內容更加豐富多元。
出版與作者資訊
之前曾提及,EPUB 的低技術門檻使版權與 DRM 相關問題浮現。如何讓出版資訊能被各種閱讀器讀取、編譯還尚待解決。
EPUB 目前是使用 Dublin Core Metadata Initiative 來標示出版資訊,其元素是用 "dc:" 做為開頭以顯示它是 Dublin Core 的元素。Dublin Core 使出版資訊標示變得更容易,不過它疏鬆的內容規範使書與書之間的標示不一,一本平面書可能有獨有的 ISBN,電子書卻是沒有一個公定的 ID 標示方法,版權資訊的標示也沒有固定模式,使各種閱讀器無從讀取或是理解電子書相關資訊。
雖說現在電子書出版資訊的標示仍然沒有一定規範,隨著語意網技術的成長與發展,使用者可以用 W3C 設計的 Resource Description Framework (RDF) 來更明確標示文件中的出版資訊。
國際化以及跨文化議題
EPUB 對於出版業是一種轉機,提供許多新可能性,不過目前它對於跨文化,或是非西方文化的文件出版,尤其是亞洲文化的支援仍然相當不足,很多人希望新的規範中能增加如直排書寫、Ruby 註解支援等等,這樣不只是出版商,同時文化保存、數位典藏相關工作也會因而受益。
結語
有鑑於電腦與網路技術不斷提升,電子出版的重要性也隨之提高。儘管至今仍有許多議題尚待討論,但望及 HTML5 與 CSS3 的進步,各種例如 DocBook、 TEI、DITA 的電子文件標示語言發展,以及電子書技術趨於成熟,一個創意、開放、自由的新網路媒體讓人相當期待!
EPUB: A Stage for Free, Open, and Creative Media
Hi! My name is Hui-Yin, and I am a summer intern at Creative Commons Taiwan. I am currently in my senior year in the English department at National Chengchi University, with a minor in computer science. I am interested in literary computing and electronic literature, so I have spent this summer studying e-books and document markup.
Introduction
In 2008, during a lunch session with the Washington Post, Microsoft CEO Steve Ballmer made an alarming statement about the future of the publishing industry: printed media will disappear within ten years, and all content will be delivered electronically over the internet.
Only two years after his prediction, the publishing industry has been turned upside down. In the second quarter of 2010, Amazon's sales of Kindle e-books surpassed its sales of hardcover books: 143 e-books were sold for every 100 hardcovers. Apple, Sony, Amazon, and Microsoft are all embedding e-reading technology into their mobile devices. To top it off, PUBU, Taiwan's first online e-book store, was established on July 28, 2010.
So what are e-books? What impact will they have on printed media? How will they stimulate creativity, open standards, and free thinking? Will they in turn cease to exist in a few years if technology continues to advance? I hope to look into some of these questions, and I shall leave others open for further speculation.
Introduction to IDPF and EPUB
To talk about e-books without first talking about EPUB would be foolish. EPUB is the official e-publishing standard set by the International Digital Publishing Forum (IDPF) in 2007, replacing the original Open eBook standard. Most e-book readers today, with the exception of the Kindle, accept the EPUB format.
The newest version of EPUB is EPUB 2.0.1, released on July 2, 2010. Version 2.1 is scheduled for release in January 2011.
The purpose of EPUB is to provide a free and open e-book standard that supports searchable text and images, displays consistently across readers and reading platforms, and provides a mechanism for embedding DRM and metadata.
The EPUB specification has three main components: the Open Publication Structure (OPS), the Open Packaging Format (OPF), and the OEBPS Container Format (OCF). Their relationship may be described in a diagram:
The awesome thing about the EPUB standard is that an EPUB book is, in short, a group of web pages zipped into a single file. One does not need profound knowledge of programming languages or computer hardware and software to create and distribute an EPUB book. Of course, this raises issues of Digital Rights Management and copyright, but the low barrier to publication will doubtless encourage wide distribution of textual content over the web, igniting a flame of creativity.
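To make the "web pages zipped into a file" idea concrete, here is a minimal sketch in Python of how such a package could be assembled. The file names (`OEBPS/content.opf`, `OEBPS/chapter1.xhtml`), the title, and the placeholder identifier are assumptions chosen for illustration. The sketch also omits the NCX table of contents that EPUB 2 expects, so it shows the overall layout rather than a validated EPUB.

```python
import io
import zipfile

# OCF part: points the reading system at the package (OPF) file.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# OPF part: metadata, manifest (list of files) and spine (reading order).
# Title and identifier are hypothetical placeholder values.
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example Book</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

# OPS part: the actual content, an ordinary XHTML page.
CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><p>Hello, EPUB.</p></body>
</html>
"""


def build_epub():
    """Return the bytes of a tiny EPUB-style zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        archive.writestr("OEBPS/content.opf", CONTENT_OPF,
                         compress_type=zipfile.ZIP_DEFLATED)
        archive.writestr("OEBPS/chapter1.xhtml", CHAPTER_XHTML,
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


if __name__ == "__main__":
    with open("example.epub", "wb") as output:
        output.write(build_epub())
```

Each of the three specifications shows up as one kind of file here: the container file belongs to OCF, the package file to OPF, and the XHTML page to OPS.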
Textual Content and Styling
The content and styling constraints of EPUB documents are defined in the OPS specification. The content is structured using a subset of XHTML that removes all interactive content, such as forms. The layout of the document is determined by CSS stylesheets, with support currently extending to CSS 2.0.
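As a rough illustration of what "a subset of XHTML without interactive content" means in practice, the following Python sketch scans an XHTML page for interactive elements. The tag list in `INTERACTIVE_TAGS` is an assumption made for this example, not the normative list from the OPS specification, and `find_interactive_elements` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed, illustrative list of interactive elements; consult the OPS
# specification for the elements it actually allows.
INTERACTIVE_TAGS = {"form", "input", "select", "textarea", "button", "script"}

SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body>
    <p>Plain text is fine.</p>
    <form action="#"><input type="text" name="q"/></form>
  </body>
</html>"""


def find_interactive_elements(xhtml_text):
    """Return the local names of interactive elements, in document order."""
    root = ET.fromstring(xhtml_text)
    found = []
    for element in root.iter():
        # ElementTree reports namespaced tags as "{namespace}localname".
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in INTERACTIVE_TAGS:
            found.append(local_name)
    return found


if __name__ == "__main__":
    print(find_interactive_elements(SAMPLE_PAGE))
```

A page that returns an empty list passes this simple check; it says nothing about whether the page is otherwise valid OPS content.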
The paradox of this publication structure is that it allows personalized styling and layout, yet asks for consistent presentation across different readers. In practice, because the requirements on readers are not strictly specified, each reader renders the same e-book file somewhat differently, which makes personalized styling difficult.
Interactive Content and Media Support
What makes an e-book different from a physical book? Of course, as an electronic file, it is much more portable, more environmentally friendly, and easier to distribute widely. More importantly, it has triggered the birth of electronic literature, which, as defined by N. Katherine Hayles, excludes print literature that has been digitized, is by contrast 'digital born,' and is (usually) meant to be read on a computer. Underland Press calls this a web novel, or a "Wovel."
Although EPUB does not yet support embedded video and audio or interactive content, it is hoped that future standards and readers will provide such support. In that sense, an e-book is no longer merely the digitization of a physical book, but text media for the internet, a new media form that combines all sorts of visual entertainment in one.
Metadata and Bibliographic Information
As mentioned previously, the ease of digital publication under the EPUB standard brings questions of copyright and DRM to light. How copyright and metadata can be standardized so that all machines can read and interpret them remains a difficult question.
EPUB adopts the Dublin Core Metadata Initiative for presenting metadata. The prefix "dc:" is added to its elements to indicate a Dublin Core element. It provides a simple way to mark up metadata. However, it raises as many difficulties as it resolves: there is no standard way of expressing copyright information. One could simply mark a work as "CC-BY-NC", or use any other free-form copyright statement. It would be impossible to ask readers to interpret such metadata.
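The following Python sketch shows how the "dc:" elements could be read out of a package file, and why the rights information is hard for software to interpret. The sample package and its values (title, author, placeholder identifier) are made up for illustration, and `read_dublin_core` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace that the "dc:" prefix conventionally maps to.
DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"

# Hypothetical package metadata, invented for this example.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example Book</dc:title>
    <dc:creator>Example Author</dc:creator>
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:rights>CC-BY-NC</dc:rights>
  </metadata>
  <manifest/>
  <spine/>
</package>"""


def read_dublin_core(opf_text):
    """Collect Dublin Core elements as {local name: [text values]}."""
    root = ET.fromstring(opf_text)
    prefix = "{" + DC_NAMESPACE + "}"
    fields = {}
    for element in root.iter():
        if element.tag.startswith(prefix):
            name = element.tag[len(prefix):]
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


if __name__ == "__main__":
    for name, values in read_dublin_core(SAMPLE_OPF).items():
        print(name, values)
```

Extracting the fields is easy; the difficulty is that the value of `dc:rights` is only a string. Nothing in the markup tells the software that "CC-BY-NC" names a license rather than, say, an arbitrary sentence written by the publisher.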
A possible, though still immature, solution may come with the development of the semantic web. The Resource Description Framework (RDF), designed by the W3C, lets authors mark up the metadata in a document more explicitly.
Internationalization and Cross-Culture Issues
EPUB offers many opportunities to the publishing industry. However, its support for cross-cultural publication, especially for Asian languages such as Japanese, is still insufficient. Many hope that the new specification will add features such as vertical text and Ruby annotation. This would benefit not only the publishing industry, but also cultural preservation and digital archiving.
Speculation
With the constant improvement of computer and internet technologies, electronic publishing is becoming more and more important. HTML5 and CSS3 are currently under development, and document markup languages such as DocBook, TEI, and DITA are flourishing across the World Wide Web. The maturing of e-book technology and the prospect of a creative, open, and free internet medium are definitely exciting.