Web Technologies and Standards

Michael B. Spring
IS 2870: Web Technologies and Standards
Department of Information Science and Telecommunications
University of Pittsburgh
Spring, 2006 -- 06-2

 

Introduction

Standards are quantifiable metrics to which parties adhere for purposes of allowing some common ground for interchange. Some students of standards view monetary systems developed for the exchange of goods as the earliest standards. The alphabet may be viewed as one of the earliest standards - a compatibility standard for the exchange of information.  Modern U.S. standards first appear in the manufacturing arena. The history is somewhat cloudy and many stories are told, but in most of them, mass production and the railroads play a role. The railroads required standardization on many fronts, from track gauge to time.

 

Overview

This course is an introduction to web technology standards. It sets the stage by looking more generally at standardization in the information technology industry. The course will explore the creation, modification, adoption, and maintenance of standards. It examines a wide spectrum of standards, focusing on those related to computing and information in electronic form.

 

The standards examined in this course are limited to those in the information technology arena. Within that realm, we look primarily at the higher-level software and information formatting standards. While hardware standards are important, they are generally addressed in other courses in the curriculum. Less well addressed, but of increasing prominence, are the software and information formatting standards. They will be the focus of our attention here.

 

Network protocols will include addressing standards (IP) and transport standards (TCP), as well as related standards such as ARP, Ethernet, and DNS. Application standards (SMTP, SNMP, FTP, TELNET, HTTP, etc.) will be reviewed with a focus on those pertinent to current efforts, currently SMTP and HTTP. The course will trace the history of distributed object standards such as CORBA and DCOM, concluding with the development of infrastructure standards for distributed services such as IDLs (including WSDL), directory service standards such as LDAP and UDDI, and security standards such as DES and PKI.

 

Data standards will include an introduction to DBMS standards such as SQL and ODBC. File formats for a selected set of file types will be reviewed (e.g., PNG, PDF, and Unicode). The XML standards will be reviewed, with particular attention to XML data types, XML Schema, XSLT, XPath, and XQuery. Derivative standards such as the Resource Description Framework (RDF), the RDF Schema language, RSS, and VRML will be introduced. The course will provide students with the opportunity to explore these standards in the course of real application development.

 

The Course

Philosophy and Approach

IS 2870 is a graduate course. This means students have a responsibility to be proactive in their learning. The instructor's role is less directive and more one of stimulating and guiding learning. If you think back to the things you remember from other courses you have taken, you will probably find that the things you remember best are the things you had to work hard to learn. What takes place in the classroom constitutes only a small part of the overall learning in any course. To this end, the course is constructed to provide three kinds of learning experiences: class lectures and discussions that help to clarify the concepts and principles in operation; readings that provide both an overview and some depth in discussion of the topic; and practical experiences related to the subject matter at hand.

 

The most important learning will come from the efforts you undertake beyond the textbook and class lectures. The lectures will provide a counterpoint to what is in the book and the assignments will require the use of skills learned in this course along with the many other skills you have developed throughout your program. The lectures for the course will cover a broad range of topics in an effort to provide both orientation and understanding about basic concepts. Finally, the assignments will provide an opportunity to see, in relatively simple situations, how the concepts discussed in the lectures and readings are implemented in practice.

Expectations about Preparation

At a very minimum, it is expected that you will read any material assigned to you before the class for which it is assigned. This does not mean skimming the material; it means reading, annotating, and understanding the material. It also means that it is your responsibility to identify areas in which you need to do extra work to bring yourself up to basic competency in the areas we will cover. Any course in this school brings together a diverse group of people with vastly different experiences. This makes it difficult to know where to start in a multifaceted course like this one. While the discussion of standards in the first part of the course is largely self-contained, the latter parts of the course will require a fundamental understanding of telecommunication and computer systems. The basic knowledge and the associated skills developed in data structures, networking, and information systems provide a common starting point for our discussion. Should you find yourself unfamiliar with these areas, you are encouraged to do some preliminary reading, skill enhancement, or thinking during the first few weeks of the course. It is important that you understand that what you take out of the course will to a large extent be determined by what you bring to it.

 

While there is no accepted or standard programming language or operating system, there is a growing trend toward establishing one. For the last decade, many viewed the Unix operating system as a standard toward which we should be moving. Because of the close association between Unix and the C programming language, which has been used extensively for interactive system programming, some saw C and C++ as emerging language standards. More recently, NT and Windows 95 have begun to emerge as operating system platform standards. In conjunction with the explosive growth of the World Wide Web, Java has begun to gain in popularity as a language of choice. While work may be done on any platform and in any scientific language, you are encouraged to use either the Unix or NT/Win95 operating systems and the C or Java languages for the programming work you do in this course.

 

Goals

The goals of the course are:

·       To define the basic characteristics of standards.

·       To explore the processes by which standards are developed.

·       To review the impact of standards on information systems.

·       To experience the process of designing/programming information systems in accord with some standard.

·       To understand the scope of information technology standards in the data, communications, operating system, and human-computer interface areas.

·       To analyze the functional needs for standards for enterprise wide computing systems.

 

Course Materials

The required textbook for this course is:

 

Students may also wish to read:

·       Scaffolding the New Web: Standards and Standards Policy for the Digital Economy, by James Schneider, et al. RAND Corporation, June 2000 (paperback). ISBN: 0833028588

·       (Copies will be provided by the instructor) Global Standards: Building Blocks for the Future, Linda Garcia, Office of Technology Assessment

·       Charles F. Goldfarb's XML Handbook, Fifth Edition, by Charles F. Goldfarb and Paul Prescod. Prentice Hall PTR, November 2003. ISBN: 0130497657

·       "Information Technology Standards." In M.E. Williams (ed.) Annual Review of Information Science and Technology. Volume 26.

·       Journal of the American Society for Information Science, Volume 43, No. 8, September, 1992, which is devoted to Information Technology Standards.

·       Standards Policy for Information Infrastructure, Brian Kahin and Janet Abbate (Eds.), MIT Press, Cambridge, Mass., 1996.

 

Course Requirements

Students are expected to come to class prepared to discuss the topic assigned for the day. In addition, students are expected to be doing their own exploratory reading on related subjects throughout the term. As indicated, the instructor believes that the knowledge and skills you take away from a course come not only from what the instructor espouses in class, but from your external readings and your own work and writing.

The instructor reserves the right to modify the course requirements. In particular, if deemed necessary, exams may be added to the requirements or substituted for selected requirements if in the instructor's opinion students are not staying abreast of course readings.

In addition to readings and class participation (accounting for 10 points of the grade awarded by the instructor), students are required to complete 90 points of project activities. There are 50 points worth of activities to be undertaken by all students. Beyond these minimum requirements, students may choose from the optional activities listed below, or they may propose new activities (in writing, to be reviewed and, if appropriate, approved by the instructor). Proposed activities may be individual or group, but care should be taken to ensure that the scope of a group activity is significantly larger than that of an individual project.

Required activities:

·       (10 points) Compose an XML schema and a sample document for one of the following document types:

·       Syllabus

·       Report

·       Dissertation

·       Resume

·        (20 points) Carve out one area of current W3C standardization activity and develop an assessment of the state of the standardization effort. Your report should address what is being standardized and how it is being standardized. In addressing what is being standardized, you will need to articulate the motivation for standardization, what is being addressed, and what is being left out. You will also need to try to understand the process that is being used, who is involved, and what is going on. (This requires that measures of good and bad standards and standardization processes be identified.) The results of your analysis should be so structured that they might be used by graduate students and professionals to better understand the situation and to use what you uncover to plan strategic directions for their companies. The report may form the preparatory work for a research paper on the standards process that would look at one or another of the issues listed below:

·       Select one critical factor that impacts standards development from the literature on standards development (e.g., the impact of small group behavior or collaborative writing on the process) and try to measure its impact.

·       Conduct a thorough literature review on the technical issues being addressed.

·       Survey a community of IT participants to validate the goal of a standardization effort.

·       (20 points) Write a program that interacts with a file that is formatted in accord with some standard. The program may read, write, or read and write a standard file format. Standard file formats include the following: mail in accord with RFC 822, hypertext in accord with HTML, and images in accord with TIFF, GIF, PNG, etc. Any of the following would be acceptable:

·       a program that extracts metadata from an image file (e.g., size, colors, comments, etc.)

·       a program that opens a standard Internet mail file (compliant with RFC 822) and allows the user to browse, read, delete, and file existing mail messages, as well as to create new or reply mail messages, which are to be written out to a "to-send" file

·       a program that extracts information about a web page and summarizes it

·       (20 points) Write a spider that collects data about a website or an RSS aggregator that collects and displays RSS feeds. 
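
As an illustration of the kind of program the file-format activity above has in mind, the following is a minimal sketch (not a model solution) of reading metadata from a PNG image. It assumes the layout defined by the PNG specification: an 8-byte signature followed by an IHDR chunk holding the image dimensions. The function name read_png_header and the keys of the returned dictionary are hypothetical, chosen only for this example.

```python
import struct
import zlib

# Every PNG file begins with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Color type codes defined for the IHDR chunk.
COLOR_TYPES = {
    0: "grayscale",
    2: "truecolor",
    3: "indexed",
    4: "grayscale with alpha",
    6: "truecolor with alpha",
}


def read_png_header(data: bytes) -> dict:
    """Return basic metadata from the IHDR chunk of a PNG byte string.

    Hypothetical helper for illustration only: it inspects just the first
    chunk and ignores the rest of the file (text comments, palette, etc.).
    """
    # Signature (8) + length (4) + type (4) + IHDR data (13) + CRC (4) = 33.
    if len(data) < 33:
        raise ValueError("too short to be a PNG file")
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file: bad signature")

    # Each chunk starts with a big-endian length and a 4-byte type code.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a valid IHDR chunk")

    body = data[16:29]
    (stored_crc,) = struct.unpack(">I", data[29:33])
    # The CRC covers the chunk type and the chunk data, not the length.
    if zlib.crc32(chunk_type + body) != stored_crc:
        raise ValueError("IHDR checksum mismatch")

    width, height, bit_depth, color_type, _compression, _filter, interlace = (
        struct.unpack(">IIBBBBB", body)
    )
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color_type": COLOR_TYPES.get(color_type, "unknown"),
        "interlaced": interlace == 1,
    }
```

A program built on this sketch might read a file in binary mode and pass its contents to read_png_header; extracting comments or color palettes would require walking the remaining chunks, which is left out here.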

Major Project Suggestions

The activities below are simply illustrative of those suggested for the class. They are far from exhaustive, and are intended only to stimulate thinking about the kinds of projects students might undertake.

·       Develop an analysis of the significant issues in standardization that must be addressed for the National Information Infrastructure to be developed.

·       Develop an automated RDF tool to catalog the information on a website.

 

Assignment Due Dates and Grading

All required activities are due by week 8. Late assignments will be subject to a 10% penalty.

Grades for all the assignments will be summed, and 0-10 points will be added by the instructor based on an assessment of class participation. Final grades will be assigned as follows:

Grade     Points
A         90-100
B         80-89
C         60-79
F         0-59

 

Course Overview

 

Schedule of Topics and Readings

This course addresses a subject that has not traditionally been a part of the information science curriculum. The topics are interdisciplinary, broad, deep, and complex. From the academic perspective, there is no comprehensive perspective or model from which the subject may be suitably studied. For these reasons, students should anticipate that the sequence of topics may be adjusted, expanded, or contracted as the term proceeds.

 

In general, the course will be divided into four parts.

·       In part one, standards and the standardization process will be examined conceptually. Models will be examined and the research will be reviewed. The various organizations involved in standards development will be examined, and the process for consensus standards development will be explored.

·       In part two, we will look at the XML family of standards in depth, particularly as they relate to web technologies and developments.

·       In part three, several standards will be explored in the following broad categories:

·       Communication Standards

·       Data Interchange Standards

·       In part four, we will return to the standardization process and look at it in the broader context of the electronic enterprise, software development broadly, and globalization.

 

Introduction and Overview

·       Introduction to the course

·       Review of the course syllabus

·       Discussion of the assignments

·       The use of CASCADE

·       The purpose of standards

·       The history of standards development

·       nationally (in the U.S. and in other countries)

·       internationally

·       Approaches to the Study of Standards

·       Economic

·       Technical

·       Policy Studies

·       The Study of Standards

·       Broad Categories

·       Internal corporate standard

·       De-facto standard

·       De-jure and client determined standards

·       Safety standards

·       National security standards (highway standards)

·       Military standards

·       DIF, CALS

·       TCP/IP, Unix

·       Industrial Combine Standards

·       MAP and General Motors

·       Boeing and TOP

·       FIPS 195 and GOSIP

·       Formal consensus standards

·       Standards in typical information transactions

·       Hardware Standards

·       Terminal design

·       The communications interface

·       LAN (local area network) cabling

·       Data Interchange Standards

·       Character sets

·       Escape sequences and control characters

·       LAN protocol standards

·       Software Standards

·       Operating System Standards

·       Programming language

·       Application Standards

Frameworks for Understanding Standards

·       Classificatory Models

·       Taxonomies

·       Economic Models

·       Impact on research and development

·       Impact on end user purchases

·       Cost of the standards themselves

·       Impact of technological developments

·       Operational Models -- OSI

·       Organizational Models

 

W3C and Standards

·       Accreditation of a developer

·       Planning and coordination of standards

·       Designation, publication, maintenance, and interpretation of standards

·       Procedures for development of a standard by a committee

·       Organization of the committee

·       Responsibilities, officers and membership

·       Subgroups of the committee

·       Meetings

·       Voting

·       Submission

·       Communications

·       Termination of the committee

·       Appeals

·       Maintenance

·       Role of consortia

 

XML Standards

·       Particular attention will be paid to XML data types, XML Schema, XSLT, XPath, and XQuery. Derivative standards such as the Resource Description Framework (RDF), the RDF Schema language, RSS, and VRML will be introduced. The course will provide students with the opportunity to explore these standards in the course of real application development.

 

Communication Standards

Lecture

·       OSI and TCP/IP

·       The Internet Community -- Internet versus OSI

·       Network protocols

·       addressing standards (IP)

·       transport standards (TCP) as well as their related standards

·       ARP

·       Ethernet

·       DNS

·       Applications standards

·       SMTP

·       SNMP

·       FTP

·       TELNET

·       HTTP

·       distributed object standards such as CORBA, and DCOM

·       infrastructure standards for distributed services such as IDLs (including WSDL)

·       directory service standards such as LDAP and UDDI

·       security standards such as DES and PKI

 

Data Interchange Standards

Lecture

·       Revisable Form Document Exchange Standards

·       ODA/ODIF

·       SGML/XML

·       Final Form Document Exchange Standards: Page Description Languages

·       Postscript

·       Interpress

·       SPDL

·       Other Data Interchange Standards

·       PDES

·       EDI

·       DBMS standards

·       SQL

·       ODBC

·       File formats

·       PNG

·       PDF

·       UNICODE, etc.

 

The Importance of Standards in the Global Economy

Readings

·       Global Standards: Building Blocks for the Future.

Lecture

Be prepared for a DISCUSSION during this class session of what you have read. If possible, we will discuss the reading with the primary author, D. Linda Garcia of the Office of Technology Assessment.

 

Standards Issues

·       Professional Issues

·       The Communications Act of 1934 -- Standards and information access

·       Privacy, security, and freedoms

·       Legal issues

·       Liability

·       Liability of the standards groups

·       Liability of the validating organization

·       Liability of the vendor

·       Intellectual Property

·       Political issues in standards development

·       Capitalism versus the Social Good

·       Professional associations: Engineers versus Producers versus Users

·       Self Certification versus third party certification

·       National governments and international organizations