Lesson 1: Basics / What is CGI?
Client Side - Server Side
Whenever functionality beyond the HTML standard is required, an additional program is needed. The first thing to decide is where that program will be executed.

If the program runs on the client side (on the user's computer), the corresponding source code has to be embedded in the HTML document. Examples are JavaScript, Java applets and ActiveX components. For security reasons, however, the capabilities of such programs are heavily restricted. They run in a strictly secured environment (for example, the Java sandbox) and are not allowed to call system functions such as creating or deleting files. The only information embedded programs may access is the elements of the (embedding) HTML document.

Example: A shopping system based on Java applets has to transfer the whole inventory along with the embedding HTML document. Furthermore, an embedded program has to be delivered in a cross-platform form so that its use is not limited to only a few operating systems (and/or CPUs). In this respect only Java and JavaScript remain: with JavaScript the source code of the program is transferred, with Java a portable pseudo-code (byte code). In either case another program on the client side first has to interpret the code and then run it.
This only works to the desired degree if the interpreters of the various users deliver the same (or at least similar) results. Unfortunately, that is not currently the case. Nevertheless, client programs are always very attractive because they make better use of the resources in the network.

Programs running on a server have no such limits. Here the security question is posed differently. A later course gives some answers to the relevant questions. Accordingly, server programs can access (within specific limits) a wide range of system resources:

  • Databases
  • Data
  • Communication services (such as sending e-mail)
  • Other programs

Furthermore, the program can take any form the server can run. There is no need to make the code portable, so the programmer can fully exploit the computer. The programming language in which the programs are written does not matter.
The only reason the literature often speaks of CGI scripts is that most CGI programs were written in a language like Perl. Languages whose source code is analysed, interpreted and run only at runtime are called scripting languages.

Server programs run into problems whenever many requests have to be answered, because (usually) there is only one computer whose capacity has to be split between the requests. Furthermore, the result of every program has to be returned, which can cause considerable network load.

Here are some features, pros and cons of client and server programs:

Client programs:

Pros: The program can use the resources of the client computer, which relieves the server. This can be an advantage for computation-intensive programs. Furthermore, the graphics and sound capabilities of the visitor's computer can be better exploited. No further transfer is needed to interact with the user (example: the user can be alerted to faulty input directly and correct it).

Cons: All required information (including the program) has to be transferred first. The program is not allowed to save data, so it has no persistent storage, that is, no "memory". Database queries always involve transfers from the server. The interpreters for the programs are partly incompatible.

Application area: Client programs are currently used predominantly for graphical effects (OnMouseOver...); sometimes forms are also pre-processed. But it is possible (and desirable) to create truly functional GUIs (graphical user interfaces) with them, through which server programs can be accessed. (Unfortunately, Java programming is not as easy as HTML design.)

Server programs:

Pros: The program can access databases and read and write data. It runs in a defined environment, so the client environments are irrelevant. It delivers its result to all clients that can handle HTML, from host systems through all kinds of PCs to WAP mobile phones. The resources of the host computer can be used efficiently.

Cons: The resources of the host computer are limited (even if it is a high-capacity system), so performance can collapse if many requests arrive at the same time. The dialogue always requires transferring information, which loads the network (especially if the answers have to be graphically rich).

Application area: All programs that need to access databases or any other central datasets.

What is CGI?
CGI stands for Common Gateway Interface and defines a standard for transferring information between a request and an application that executes the request, with both sides making use of the HTTP protocol (HTTP = Hypertext Transfer Protocol).

The information flow usually looks like this: using an HTTP browser (Netscape Navigator, Microsoft Internet Explorer), a client sends a request in the form of a file name to the HTTP server. In most cases this is an HTML page that the HTTP server can send to the client directly.
But certain characteristics of the request tell the HTTP server that the request should be relayed to another program. Usually these characteristics can be recognised directly in the address of the request (that is, in the file name or the complete path to the file).
An address such as http://www.tdb-engine.de/demos/hello.pl is broken down into the following components:

http                 the protocol (and thus the corresponding server)
//www.tdb-engine.de  the network server belonging to this domain (IP address)
/demos/              the virtual path to the requested file on the server
hello.pl             the requested file
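
The breakdown above can be reproduced in a few lines of Python, which is used here purely for illustration (the course itself works with tdbengine). This is a minimal sketch based on the standard library modules urllib.parse and posixpath; the variable names are assumptions made for this example.

```python
from urllib.parse import urlsplit
import posixpath

url = "http://www.tdb-engine.de/demos/hello.pl"
parts = urlsplit(url)

protocol = parts.scheme            # the protocol
server = parts.netloc              # the network server (domain)

# Split the path into the virtual directory and the requested file.
directory, filename = posixpath.split(parts.path)
virtual_path = directory + "/"

print(protocol, server, virtual_path, filename)
```

A real HTTP server receives only the path part of the address in the request line; the full URL is split here just to make all four components visible.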

The HTTP server now determines the type of the program

a) by the extension of the requested program's name

Examples:  

  
 .pl   Perl programs 
 .prg   tdbengine programs 
 .cgi   Unix-Shell scripts 

b) by the directory the requested file is in

Examples:  

  
 /cgi-bin/   Standard CGI directory on Unix/Linux computers 
 /scripts/   Standard CGI directory on Windows NT 
 /cgi-tdb/   favoured directory for tdbengine programs 

Which of these two kinds of external program call the server recognises, and which it does not, is defined in the server's configuration.
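
The two rules can be sketched as a small Python function. This is only an illustration of the decision, not how any particular HTTP server implements it; the function name is_cgi_request and the two configuration constants are hypothetical and simply reuse the examples listed above.

```python
import posixpath

# Hypothetical server configuration, modelled on the examples in the text.
CGI_EXTENSIONS = {".pl", ".prg", ".cgi"}
CGI_DIRECTORIES = ("/cgi-bin/", "/scripts/", "/cgi-tdb/")

def is_cgi_request(path: str) -> bool:
    """Return True if the requested path should be handed to a CGI program."""
    # Rule a) the file extension marks the program type.
    extension = posixpath.splitext(path)[1].lower()
    if extension in CGI_EXTENSIONS:
        return True
    # Rule b) the file lies inside a configured CGI directory.
    return path.startswith(CGI_DIRECTORIES)

print(is_cgi_request("/demos/hello.pl"))
print(is_cgi_request("/index.html"))
```

Whether a server applies rule a), rule b) or both depends on its configuration, so the two constants would normally be read from there.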

Static and dynamic

HTTP requests that the server can answer immediately by transferring the requested files to the client, without accessing other program resources, are called static requests.

These are predominantly

HTML documents .htm, .html
Text documents .txt
Pictures .gif, .jpg, .jpeg
Sounds .mp3, .wav, .au

Requests that can only be answered with the help of other programs are called dynamic requests. These include, among others,

CGI programs (see above)
ISAPI applications .isa (not discussed here)
NSAPI applications .nsa (not discussed here)

All common HTTP servers also support a hybrid form in which dynamic content is embedded in static HTML documents. These embeddings are called "server side includes"; you can recognise the embedding HTML documents by file extensions such as .shtm or .shtml. Server side includes are not covered in this basic course.

How the data is transferred
In either case the HTTP server receives the request first. If the server detects a call to a CGI program, it creates an environment for it. This consists of
  • a set of environment variables
  • a set of input and output channels
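
What "creating an environment" means can be sketched in Python with the standard subprocess module. This is a simplified illustration, not the way a real HTTP server is built: the function name run_cgi and the tiny stand-in program in CHILD_SOURCE are made up for this example, and QUERY_STRING is one of the CGI variables described further down in this lesson.

```python
import os
import subprocess
import sys

# Stand-in for a real CGI program: it reads one environment variable and
# its standard input, then writes a reply to its standard output.
CHILD_SOURCE = (
    "import os, sys\n"
    "body = sys.stdin.read()\n"
    "sys.stdout.write(os.environ['QUERY_STRING'] + '|' + body)\n"
)

def run_cgi(query_string: str, request_body: str) -> str:
    """Start the program with its own environment and connected channels."""
    env = dict(os.environ)                # start from the server's environment
    env["QUERY_STRING"] = query_string    # add a request-specific variable
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        env=env,
        input=request_body,               # becomes the program's input channel
        capture_output=True,              # collect the program's output channel
        text=True,
        check=True,
    )
    return result.stdout

print(run_cgi("command=read&page=main", "hello"))
```

The called program never talks to the network itself: everything it knows about the request arrives through the environment and the input channel, and everything it returns leaves through the output channel.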

Environment variables are nothing more than strings that can be looked up under fixed names. The shell (command processor) also has a set of environment variables, which you can display with the "set" command. To do so, switch to a terminal (MS-DOS prompt on Windows) and enter the following:

set [RETURN]

You will get a list of all environment variables of your shell (here is an example):
 

  
TMP=C:\WINDOWS\TEMP
TEMP=C:\WINDOWS\TEMP
PROMPT=$p$g
winbootdir=C:\WINDOWS
COMSPEC=C:\WINDOWS\COMMAND.COM
PATH=C:\WINDOWS;C:\WINDOWS\COMMAND;C:\PP\BIN\GO32V2
windir=C:\WINDOWS
BLASTER=A220 I5 D1 T4
...

Or on Linux:

  
BASH=/bin/bash
BASH_VERSINFO=([0]="2" [1]="03" [2]="0" [3]="1" [4]="release" [5]="i686-pc-linux-gnu")
BASH_VERSION='2.03.0(1)-release'
COLORTERM=1
COLUMNS=80
ENLIGHTENMENT_ROOT=/usr/X11R6/lib/X11/enlightenment
EUID=0
GNOMEDIR=/opt/gnome
GS_FONTPATH=/usr/share/lilypond/afm:/usr/share/lilypond/pfa
HISTCONTROL=ignoredups
HISTFILE=/root/.bash_history
HISTFILESIZE=500
HISTSIZE=500
HOME=/home/tdbengine
HOSTNAME=uli
HOSTTYPE=i686
...

Environment variables are very popular for passing information from one program to another (similar to the clipboard that many programs use), because programs can both read and set such variables.
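
Reading and setting such a variable from a program takes only a few lines. The following Python sketch is for illustration only; the variable name EXAMPLE_MESSAGE is invented for this example.

```python
import os

# Read a variable; the default avoids an error if it is not set.
path_value = os.environ.get("PATH", "")
print("PATH has", len(path_value), "characters")

# Set a variable. The name EXAMPLE_MESSAGE is made up for this sketch.
os.environ["EXAMPLE_MESSAGE"] = "hello from the parent"

# Programs started from here on inherit the changed environment.
print(os.environ["EXAMPLE_MESSAGE"])
```

Note that a program can only change its own environment and that of the programs it starts afterwards, not the environment of the program that started it.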

The HTTP server likewise creates a set of environment variables that take into account the information sent by the client. In addition, the HTTP server uses environment variables to tell the program being called about the server's own properties.

The CGI environment variables
Note: You do not have to memorise the following list, because on the one hand tdbengine fully prepares the most important variables for you, and on the other hand you can look them up online at any time if you need them.
The CGI specification says that at least the following environment variables have to be created:

Server-specific environment variables


GATEWAY_INTERFACE
This environment variable holds the revision of the CGI specification that the server supports.
Format: CGI/<revision>

SERVER_NAME
The SERVER_NAME variable holds the name of the computer the server software is running on.
It is given as the host name of the server, as a DNS alias or as an IP address. For example: www.cs.tu-berlin.de

SERVER_SOFTWARE
This variable contains the name and version of the WWW server that started the CGI script.
Format: <name>/<version>

DOCUMENT_ROOT
This variable contains the path of the document directory of the WWW server as specified in the server configuration.
For example: /usr/local/www/doc

Request-specific environment variables
The values of the environment variables in this section are request-specific. They are set depending on the request that was sent to the server.

AUTH_TYPE
For protected scripts, this variable states the authentication method used.
For example: Basic

CONTENT_LENGTH
With METHOD="PUT" or "POST", CONTENT_LENGTH contains the length in bytes of the data supplied by the client.
With METHOD="GET", CONTENT_LENGTH is empty.

CONTENT_TYPE
This variable contains the type of the data (MIME type) for requests that transfer data to the server, such as HTTP PUT or POST.
For example (online form): application/x-www-form-urlencoded

HTTP_ACCEPT
This variable contains a list of the MIME content types that the client can understand, as given in the HTTP header.
The individual elements are separated by commas. Format: <type>/<subtype>, <type>/<subtype>, ...

HTTP_FROM
This variable contains the e-mail address of the user who made the request. Not all browsers support transmitting the user's e-mail address.

HTTP_REFERER
This variable contains the URL of the document from which the client referenced the CGI script.

HTTP_USER_AGENT
This variable provides information about the client software (Netscape, Mosaic, ...) that activated the CGI script.
For example: Mozilla/2.0 (Win16; I)

PATH_INFO
There are several ways to pass parameters to a CGI script when it is started.
One of them is to append the information to the URL that references the script (separated by a slash '/').
This information (including the leading '/') is then available in the environment variable PATH_INFO.
Unfortunately, this method of passing parameters is the least robust and the least clean. The variable was originally intended to hold a file name that follows the virtual CGI script path. A CGI script is accessed via a virtual path name (for example: '/CGI/'). If a script is referenced with the URL 'http://<server>/CGI/file', PATH_INFO will contain the value '/file'. But this value is URL-encoded, which means that all special characters that were in the URL are encoded. For example, a space is not allowed in a URL and is replaced by a '+' sign; since all spaces are encoded as '+', it is no longer possible to distinguish a '+' that stands for a space from a real plus sign.
You should therefore avoid passing parameters via PATH_INFO, especially since there is a more convenient method using the environment variable QUERY_STRING.

PATH_TRANSLATED
As mentioned above, PATH_INFO was mainly intended for passing file names. But such file names are of no use unless their location in the file system is passed along as well. This is the job of PATH_TRANSLATED. The server runs the content of PATH_INFO through its mapping system and replaces all virtual path components with physical ones. So with '/file' as the value of PATH_INFO and a mapping of '/*' to '/usr/local/WWW/pub/*', the variable PATH_TRANSLATED yields the value '/usr/local/WWW/pub/file'.
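
The mapping step can be illustrated with a short Python function. It is a deliberately simple sketch that covers only the single rule from the example ('/*' mapped to one physical directory); the name translate_path is hypothetical, and a real server has a more elaborate mapping system.

```python
def translate_path(path_info: str, document_dir: str) -> str:
    """Map a virtual path to a physical one for a '/*' -> document_dir/* rule."""
    # Drop the leading slash and append the remainder to the physical directory.
    relative = path_info.lstrip("/")
    return document_dir.rstrip("/") + "/" + relative

print(translate_path("/file", "/usr/local/WWW/pub"))
```

The sketch does not check for path components such as '..', which a real server would have to reject so that requests cannot leave the mapped directory.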

QUERY_STRING
The environment variable QUERY_STRING is set in one of the following three cases:
1. The CGI script is called from a document that allows a search term to be entered by means of the ISINDEX tag. The search term is assigned to the environment variable QUERY_STRING.
2. The script is called from a clickable (sensitive) inline image. In this case QUERY_STRING contains the coordinates of the mouse click in the image.
3. The script is the recipient of data from an online form that was sent to it using the "GET" method.
In each of the three cases the WWW client appends a question mark followed by the particular data to the URL that references the script. With ISINDEX this data is the search term, with sensitive images it is the mouse coordinates, and with online forms it is the form data.

REMOTE_ADDR
This variable contains the IP address of the client computer.
For example: 130.149.18.37

REMOTE_HOST
The environment variable REMOTE_HOST contains the name of the computer the request came from.
If the server does not have this information because the requesting computer has no domain entry, this variable is empty.
In that case REMOTE_ADDR should be able to help.
For example: quofum.cs.tu-berlin.de

REMOTE_IDENT
If an authentication server conforming to RFC 931 is running on the client system, the WWW server can determine the identity of the client and pass it to the CGI script in REMOTE_IDENT. Use this information with caution, and at best for logging purposes, because it is not trustworthy in all cases.

REMOTE_USER
For access-protected documents this variable gives the user name. It does not necessarily have to be identical to the UNIX user name.

REQUEST_METHOD
The method with which the request was made can be found in the environment variable REQUEST_METHOD. Examples for HTTP as the server protocol are "GET", "HEAD", "PUT", "POST" and so on.

SCRIPT_NAME
This environment variable contains the file name of the script including the virtual path to it. It is mainly useful for scripts that reference themselves, because scripts otherwise cannot know under which virtual path name they are reachable.

SERVER_PORT
This variable contains the port number the request was sent to (usually port 80).

SERVER_PROTOCOL
The name and version of the protocol with which the request to the server was made can be found in the environment variable SERVER_PROTOCOL.
Format: <protocol>/<revision>

Only the following of these are important to a CGI programmer:
SCRIPT_NAME    which script is to be run
PATH_EXPANDED  the real path to the requested file
QUERY_STRING   the additional information transferred via the URL
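
How a CGI program might look at these three variables is sketched below in Python, purely for illustration (with tdbengine this preparation is done for you). The function name describe_request is an assumption of this example; the variable names are taken from the list above, and any variable the server did not set is simply reported as empty.

```python
import os

def describe_request(environ) -> str:
    """Collect the variables the text calls important, one per line."""
    names = ("SCRIPT_NAME", "PATH_EXPANDED", "QUERY_STRING")
    lines = []
    for name in names:
        # Variables the server did not set are reported as empty.
        lines.append(f"{name}={environ.get(name, '')}")
    return "\n".join(lines)

# Outside a web server these variables are usually unset, so the
# values printed here will normally be empty.
print(describe_request(os.environ))
```

Passing the environment in as a parameter, rather than reading os.environ inside the function, makes it possible to try the function with a hand-made dictionary.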

Note:
tdbengine prepares this information fully automatically.

The input and output channels
But this information alone is not sufficient. In particular, the CGI program has no way of reporting its result (output) to the client this way.

It would be possible for the CGI program to change one of the environment variables (writing its output into it), after which the HTTP server would transfer it to the client. But this would have two serious disadvantages: the memory available for the environment is limited, so the output of the CGI program would be limited too. And the HTTP server would have to take over the transfer of the data, which would restrict its usability.

That is why the CGI standard provides for the output channel to be passed on to the client directly. The limited size of the environment is also the reason an additional channel is created for larger transfers of information from the client to the server.

Consequently, two information channels are created when a CGI program starts:

  • StdOut for information transfer from CGI program to the client
  • StdIn for information transfer from client to CGI program

But while StdOut is used all the time (a CGI program has to return something), StdIn is used only in specific cases.
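
Writing to StdOut can be sketched as follows in Python. The example goes slightly beyond what this lesson covers: the CGI specification expects the output to start with at least one header line (here Content-Type) followed by an empty line, and only then the document itself. The function name build_response is made up for this illustration.

```python
import sys

def build_response(html_body: str) -> str:
    """Return a minimal CGI reply: header line, empty line, then the document."""
    header = "Content-Type: text/html\n"
    return header + "\n" + html_body

# Everything written to standard output is passed on to the client.
sys.stdout.write(build_response("<html><body>Hello</body></html>\n"))
```

Building the reply as a string first and writing it in one step keeps the header and the body together and makes the function easy to check without a web server.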

get and post
The CGI standard provides two methods for transferring information from the client to the server:

get

Here all information is passed in the URL. The additional information is separated from the actual URL by a question mark "?". Everything after the question mark is placed by the HTTP server in the environment variable QUERY_STRING. Individual pieces of information are separated by the "&" character.
Example: http://www.tdb-engine.de/cgi-tdb.prg?command=read&page=main.
QUERY_STRING=command=read&page=main.
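
The splitting described above can be tried out with Python's standard urllib.parse module. This is only an illustration of the principle; the URL is the one from the example, with the trailing full stop treated as sentence punctuation and left out.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.tdb-engine.de/cgi-tdb.prg?command=read&page=main"

query_string = urlsplit(url).query     # everything after the "?"
fields = parse_qs(query_string)        # split at "&" and "="

print(query_string)
print(fields)
```

parse_qs returns a list of values for every name, because the same name may occur more than once in a query string.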

post

Here all information is transferred to the CGI program through the StdIn channel. It follows that a normal link <a href="..."> can only use the get method (because only the URL is transferred here). Only in forms can you choose which method is used:
<form method="get" ...> -> get
<form method="post" ...> -> post
Note: In forms even a mixed approach is possible, because on the one hand the URL for calling the program is given (action="...") and can contain get parameters, and on the other hand the form fields can be transferred using the "post" method.
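
On the receiving side, a program handling the post method reads as many bytes from StdIn as CONTENT_LENGTH announces. The following Python sketch shows the idea under two assumptions: the data arrives in the default form encoding (application/x-www-form-urlencoded) and contains only ASCII characters. The function name read_post_fields and the simulated request are invented for this example.

```python
import io
from urllib.parse import parse_qs

def read_post_fields(environ, stdin_bytes) -> dict:
    """Read exactly CONTENT_LENGTH bytes from the input channel and parse them."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    raw = stdin_bytes.read(length)
    return parse_qs(raw.decode("ascii"))

# Simulated request; a real CGI program would pass os.environ and
# sys.stdin.buffer instead of these stand-ins.
body = b"command=read&page=main"
fake_environ = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
print(read_post_fields(fake_environ, io.BytesIO(body)))
```

Reading only the announced number of bytes matters because the server is not required to close the input channel after the data.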

Form fields and their values are transferred to the server (and thus to the CGI program) as follows:

  
<input type="text" name="xyz">                  xyz=user's entry
<input type="hidden" name="xyz">                xyz=value of the field (unencoded!)
<input type="checkbox" name="xyz" value="1">    xyz=1, if selected by the user
<input type="radio" name="xyz" value="1">       xyz=1, if this option is selected
<input type="radio" name="xyz" value="2">       xyz=2, if this option is selected
<select name="xyz">                             xyz=1, if this option is selected
<option value="1">
<option value="2">...
</select>
<select name="xyz" multiple>                    xyz=1&xyz=2.. if these options are selected
<option value="1">
<option value="2">...
</select>
<textarea name="xyz">...</textarea>             xyz=content of the text
<input type="submit" name="xyz" value="done">   xyz=done, if this button is activated

Example:
A document contains the following form:

<form action="http://www.tdb-engine.de/cgi-tdb/savemail.prg" method="get">
E-Mail: <input type="text" name="email"><br>
Name: <input type="text" name="name"><br>
<input type="submit" name="done" value="send">
</form>

The user fills the two fields with "info@tdb-engine.de" and "Webmaster".
The browser then sends the following URL to the HTTP server:

http://www.tdb-engine.de/cgi-tdb/savemail.prg?email=info@tdb-engine.de&name=Webmaster&done=send
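
What the browser does here can be imitated with Python's urllib.parse.urlencode. This is an illustrative sketch, not a description of any particular browser; note that urlencode escapes the "@" as %40, whereas the URL above shows the character unescaped.

```python
from urllib.parse import urlencode

action = "http://www.tdb-engine.de/cgi-tdb/savemail.prg"
fields = [
    ("email", "info@tdb-engine.de"),
    ("name", "Webmaster"),
    ("done", "send"),       # the submit button is transferred as well
]

query_string = urlencode(fields)
print(action + "?" + query_string)
```

The fields are given as a list of pairs so that they appear in the URL in the same order as in the form.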

Unless the form specifies otherwise, the data entered by the user is put into a special form called URL encoding.
Many characters must not be used when transferring data in a normal URL (for example: space, umlauts, &/?...).

That is why such characters are converted into characters allowed in a URL before they are transferred. Since the browser takes care of the encoding (at least for forms) and, on the server side, tdbengine (not the HTTP server) takes care of the decoding, we will not go deeper into this topic here.

You can see the encoding, for example, by using a search engine: enter a search term containing some umlauts and look at the resulting URL.
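
The same experiment can be made offline with Python's urllib.parse module. The search term is made up for this sketch, and the functions used here encode umlauts as UTF-8 by default; which character set a browser uses may differ.

```python
from urllib.parse import quote_plus, unquote_plus

search_term = "Müller & Söhne"         # made-up input with umlauts and "&"

encoded = quote_plus(search_term)      # UTF-8 is the default encoding here
decoded = unquote_plus(encoded)

print(encoded)
print(decoded == search_term)
```

Spaces become "+" and the "&" is escaped, so the encoded value can no longer be confused with the separator between two form fields.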

Note: The extended encoding multipart/form-data is not discussed here.

Summary
This lesson covered the basic elements. You should now know what a CGI program is, where it runs and how information is transferred between the client and the server.

Challenges:
1. How does the HTTP server know that a request is a CGI call?

2. Call the following URL on the Internet: http://www.tdb-engine.de/scripts/set.prg
You will get a list of (almost) all environment variables that the HTTP server (here an Internet Information Server) provides to a CGI program. Which of these environment variables can you use to identify the user's browser software? What would this entry be for your browser?

3. What is the name of the CGI program on Yahoo that answers search queries?

4. You have the following form on an HTML page:

<form action="http://www.meinedomain.de/cgi-tdb/log_in.prg" method="get">
First name: <input type="text" name="Firstname"><br>
Name: <input type="text" name="Name"><br>
<input type="submit" name="command" value="send">
</form>

The user fills the fields with "Hans" and "Mueller" and clicks "send".
What does the URL that the client sends to the server look like?

5. What does the environment variable QUERY_STRING contain in this case (Challenge 4)?




   Copyright © 2003-2004 tdb Software Service GmbH
   All rights reserved
   Last changed: 25.11.2003