TECHNOLOGY FOR RAPID EXCHANGE OF SCIENTIFIC INFORMATION VIA THE INTERNET

R. Daniel Lineberger
Professor of Horticulture
Department of Horticultural Sciences
Texas A&M University
College Station, Texas, USA 77843

An invited presentation delivered by the author at the International Symposium on Strategies for Market Oriented Greenhouse Production, 11 - 15 March 1995, Alexandria, Egypt. Published in the proceedings of the conference as Acta Horticulturae 434:407-412.

This contribution was peer-reviewed and has been revised according to the helpful comments of Dr. Tim Rhodus.

Summary
The World Wide Web is the most rapidly growing communication system in use today. Computers with network connections to the Internet can serve and receive files in a hypermedia format, giving users a graphical interface to text, photographs, sounds, and motion video. The Web links networked computers of all sizes and types through a hypermedia application known as a "browser." Browsers are available for Windows, Macintosh, UNIX, NeXT, and Linux operating systems. Most of the software needed to serve and receive information using this technology is free to educational and non-profit users, including consumers, and is readily available by FTP from many sites on the Internet. Businesses may need to pay a fee to use the software.

Hypermedia technology allows research-based information related to protected cropping to be disseminated world-wide rapidly and cheaply, and to audiences that previously had difficulty accessing the information through scholarly journals. There are many applications of the World Wide Web appropriate to multinational research projects. Databases of the participants containing photographs and biographical data could be established to assist the formation of joint research programs. Research results could be published on the Web rapidly, and these data and findings could then be viewed by other project participants, project administrators, funding agencies, and those who directly benefit from the information, e.g., farmers, growers, retailers, educators, and consumers.

Additional Index Words
Internet, information server, gopher, multimedia, computer technology

1. Introduction
The goal of any scientific communication medium is to facilitate the rapid and accurate transmission of information. Most media are constrained additionally to offer this information exchange at a minimal cost, since the money available to scientists for purchase of books, journals, and encyclopedias is limited. Traditional media include books, scholarly journals, trade and industry publications, and newsletters. Many scientists are no longer able to afford to purchase individual subscriptions to all the scholarly reference works needed to stay current in their fields. Journal subscription rates increase yearly, and new journals are added to store the ever increasing volume of scientific information available. This problem is compounded by the fact that university and research center libraries are decreasing the breadth of their subscriptions because of escalating publishing costs and proliferation of the number of journals.

In addition to the obvious economic limitations to assembling the "ideal" personal reference library, the pitfalls of depending on institutional or governmental libraries to meet information needs are equally obvious. Libraries typically purchase few copies of any given work, and in most active research fields, competition for information resources is keen. The library upon which one depends may not place a high priority on a given field, necessitating interlibrary loans, which by their nature are quite time consuming. Furthermore, institutional and governmental libraries serve the public very poorly, if at all. Public libraries and secondary school libraries carry few periodical-type sources, and consumers and students often have no source for scientific information other than direct contact with researchers and university faculty.

It was in the context of a tremendous need for rapid access to information of all types in a cost effective manner that the World Wide Web emerged. From its inception in 1989 as a tool for sharing research communications among the European physics community, the Web has grown exponentially. Web servers now number in the thousands, and essentially connect all the geographic area served by the Internet into a global information base. Scientific, governmental, and recreational information servers predominate but private enterprise is adopting the technology as a sales and marketing tool at a rapid rate.

The World Wide Web is the interconnected network of information-serving computers that send and receive information in a common format known as "hypertext." The World Wide Web uses the Internet for electronic connectivity, but the Web is not synonymous with the Internet. Figure 1 graphically illustrates the organization of the World Wide Web and its relationship to the Internet. Hypertext document transfer is only one of many types of file transfer on the Internet. The Internet is also used for transferring electronic mail, financial transactions, data and binary application files, and other types of document files including gopher and newsgroup files. The hypertext documents are constructed to an international standard known as "hypertext markup language" (HTML). HTML documents can reference files containing high quality color images, audio, and compressed motion video, and can link to other documents or files on the same server or on any other server on the Internet.

2. System Configuration
Because HTML is not platform specific, HTML files can be transferred between micro-, mini- and mainframe computers. The basic unit of transfer is the HTML document, a text file in ASCII format containing codes that control document formatting on the receiving computer. These uniform codes, characteristic of HTML, govern how a document appears when it is displayed on the receiving (client) computer (the character string <b>bold</b> would appear as bold on the receiving computer, for example). The codes also allow image files to be displayed within documents and, importantly, make links to other computers and other files when special codes called anchors are clicked on. Creating HTML files is learned easily, and instructions for doing HTML markup are available from many sources on the Web.
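As a minimal sketch of how a receiving computer might interpret these codes, the following Python example reads a short HTML document and collects the bold text, image file references, and anchor links it contains. The sample document, the file name tomato.gif, and the example.org address are hypothetical, and Python's standard html.parser module merely stands in for the far more elaborate processing a real browser performs.

```python
from html.parser import HTMLParser

# A small HTML document. The image file name and the URL are hypothetical.
SAMPLE_DOCUMENT = """<html>
<head><title>Greenhouse Notes</title></head>
<body>
<p>This word is <b>bold</b> on the client computer.</p>
<img src="tomato.gif">
<p>See the <a href="http://example.org/results.html">full results</a>.</p>
</body>
</html>"""


class MarkupSummary(HTMLParser):
    """Collect bold text, image references, and anchor links from HTML."""

    def __init__(self):
        super().__init__()
        self.bold_text = []   # text found between <b> and </b>
        self.images = []      # values of <img src="...">
        self.links = []       # values of <a href="...">
        self._in_bold = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "b":
            self._in_bold = True
        elif tag == "img" and "src" in attributes:
            self.images.append(attributes["src"])
        elif tag == "a" and "href" in attributes:
            self.links.append(attributes["href"])

    def handle_endtag(self, tag):
        if tag == "b":
            self._in_bold = False

    def handle_data(self, data):
        if self._in_bold:
            self.bold_text.append(data)


def summarize(document):
    """Parse an HTML string and return the populated MarkupSummary."""
    parser = MarkupSummary()
    parser.feed(document)
    parser.close()
    return parser
```

Calling summarize(SAMPLE_DOCUMENT) yields the one bold word, the one image reference, and the one anchor address in the sample. A browser would go on to render the bold text, fetch the image, and follow the anchor when it is clicked.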

2.1 Server Software
Server software is not required to view (or browse) information stored on the Web. It is required only to serve information from a Web site, and its discussion here is meant only to demonstrate how the server-client system operates. It is recognized that not all who obtain information from the Web are able to serve information as well. The software that sends files over the Internet from the serving computer is called an "HTTP" (hypertext transfer protocol) server application. HTTPd is a UNIX server application, and MacHTTP is an HTTP server for the Macintosh operating system. HTTP applications are generally RAM-efficient (the Macintosh version, for example, uses only 754 KB of RAM). They can run "in the background" on a desktop computer without adversely affecting the operations of that computer unless the server is an extremely active one. Because the server can share a desktop computer with other work, HTML documents can be constructed and stored on the same machine that serves them, and an individual with only one computer can be an independent Web server site. Setting up a server without a direct Ethernet connection to the Internet is more difficult, and the reader is advised to investigate working with an Internet service provider in that situation.
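The server-client exchange described above can be sketched, under stated assumptions, with Python's standard http.server and http.client modules. This is not HTTPd or MacHTTP, only an illustration of the same idea: a server running in the background hands an HTML file to a client that requests it. The loopback address 127.0.0.1, the temporary directory, and the file name index.html are all assumptions made for the example.

```python
import functools
import http.client
import tempfile
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


class QuietHandler(SimpleHTTPRequestHandler):
    """File-serving request handler that does not print a log line per request."""

    def log_message(self, format, *args):
        pass


def start_server(directory, port=0):
    """Serve files from `directory` in a background thread.

    Port 0 asks the operating system to choose any free port.
    """
    handler = functools.partial(QuietHandler, directory=str(directory))
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def stop_server(server):
    """Stop the background server and release its port."""
    server.shutdown()
    server.server_close()


def fetch(host, port, path):
    """Act as the client: request `path` and return (status, body text)."""
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        return response.status, response.read().decode("ascii")
    finally:
        connection.close()


def demo():
    """Serve one HTML file from a temporary directory and fetch it back."""
    with tempfile.TemporaryDirectory() as directory:
        Path(directory, "index.html").write_text(
            "<html><body><b>bold</b></body></html>", encoding="ascii"
        )
        server = start_server(directory)
        try:
            port = server.server_address[1]
            return fetch("127.0.0.1", port, "/index.html")
        finally:
            stop_server(server)
```

Here demo() plays both roles on one computer, which mirrors the single-machine arrangement described above. On a networked server the client would instead connect from another computer using the server's Internet address.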

2.2 Client/Browser Software
The applications that display the HTML files on the client computer are referred to as Web "browsers." Browsers are a graphical user interface to the Web, and they require an operating system capable of multimedia type displays (Windows, Macintosh, UNIX, etc.). The browsers are truly multimedia applications. Within the same document, one can display formatted text, high resolution color images, sounds, and even motion video. The text and images can then be printed to a device connected to the client computer, such as a laser printer, producing high quality printed documents in fonts specified by the client computer, or they can be saved to disk and incorporated into other application software.

Browser software approximates the "look and feel" of using a CD-ROM. In this case, however, instead of being limited to the data on a CD-ROM, the user is simultaneously connected to any other computer on the World Wide Web. Moving from page to page, from computer to computer, if you will, is as simple as clicking the mouse on any text that is highlighted (that is, an anchor linking to another file or computer), or in some cases, on a given spot on a special type of image called a "clickable image" that contains hidden links to other files.

Browser software also displays text files stored on gopher servers and can print those files to the user's local printer. Browser software can be used in the "local only" mode, fetching and viewing files from the computer upon which it resides. This feature allows for the creation of multimedia documents that can be viewed without an Internet connection. Browsers also provide access to Internet newsgroups and permit FTP file exchange.
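The "local only" mode can be illustrated with a brief Python sketch, assuming the standard urllib.request module as a stand-in for the browser. The document is addressed with a file:// URL rather than an http:// URL, so no network connection is involved. The file name notes.html and its contents are hypothetical.

```python
import tempfile
import urllib.request
from pathlib import Path


def read_local_document(path):
    """Open a local HTML file through a file:// URL and return (url, text)."""
    url = Path(path).resolve().as_uri()
    with urllib.request.urlopen(url) as response:
        return url, response.read().decode("ascii")


def demo_local():
    """Write a small HTML file to a temporary directory and read it back."""
    with tempfile.TemporaryDirectory() as directory:
        page = Path(directory, "notes.html")
        page.write_text("<html><body>Local copy</body></html>", encoding="ascii")
        return read_local_document(page)
```

The same document could later be placed on a server and reached by an http:// address without any change to its contents.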

As of this writing, the most widely used Internet browser application is Mosaic, a product developed and updated frequently by the National Center for Supercomputing Applications (NCSA) at the University of Illinois. Mosaic is available in versions for most computer operating systems. Mosaic is free, and can be downloaded using telnet or a similar application with ftp (file transfer protocol) capability from the server at the following Internet address: ftp.ncsa.uiuc.edu. The application is in the directory /Mosaic. The image viewer and sound enabling helper applications and instructions for installing and configuring the software are also available from the same site. The readme files that accompany the software are important for the initial setup of the program.

Netscape Navigator 1.0 has all the same compatibility features, and operates quite similarly to Mosaic. Netscape is a bit faster, has many image viewing capabilities built-in, and has more advanced text formatting features than does Mosaic. Netscape represents a second stage in the rapid evolution of the quality, versatility, and ease of use of Internet browsers. Netscape Navigator 1.0 is available by ftp from the server at: www.mcom.com.

Microsoft Corporation has announced that the next version of the Windows software (currently named Windows 95) will have Internet browsing software built-in, so that the user connected to the Internet via an Ethernet connection or high speed modem will have "one button" access to the World Wide Web. The IBM operating system OS/2 Warp already incorporates an Internet interface and file exchange protocols.

3. Applications
The World Wide Web is the most rapidly growing medium for information exchange throughout the world. Since most of the world is served already by the Internet (parts of Africa and Asia being the exceptions), the development of the Web and its growing use by the nonacademic community were logical and predictable outcomes. The Internet is used most heavily on a percentage basis by the academic community in the United States (Hughes, 1994). (Author's note: The address for obtaining the document prepared by Kevin Hughes is given in the bibliography. It is an excellent, well-written but brief publication that describes the origin and development of the World Wide Web and suggests several sites for initial exploration. The effort needed to obtain a copy is an excellent investment of time!)

3.1 Exchange of Scientific Information
The World Wide Web (or an advanced version of it) eventually will replace books and journals as the primary medium for exchange of scientific information. The libraries of the future will serve as electronic repositories of information. Librarians will assist researchers, teachers, and the public in electronic information retrieval, rather than serve as guardians and catalogers of printed material. Many journals are available in electronic form today, and many more are developing plans to go on-line. Most of the information currently available in printed form will be placed in an electronic form, although it is unclear who will take responsibility for doing so. Present and rapidly developing information has been given a priority by those individuals and corporations developing servers for the World Wide Web, and archival information already in libraries is a lesser priority.

The many reasons that scientific information exchange is evolving toward the World Wide Web as a delivery tool are readily deduced from the following features: