Search engine definition
Systems that locate resources (web pages, files, pictures) on the WWW
Search engine index
A record of all the resources on the WWW
Created using software called web crawlers
Web crawler
Software that continuously crawls the web to discover and record publicly available web pages by following hyperlinks
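A minimal sketch of the crawl loop, assuming a tiny in-memory "web" in place of real HTTP requests. The PAGES dictionary, the example URLs and the crawl function are all hypothetical; a real crawler would fetch pages over the network and run continuously.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> hyperlinks found on that page.
PAGES = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example", "d.example"],
    "d.example": [],
}

def crawl(start_url, pages):
    """Breadth-first crawl: follow hyperlinks and record each page once."""
    discovered = [start_url]   # pages recorded so far, in discovery order
    seen = {start_url}         # guards against revisiting a page
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        for link in pages.get(url, []):
            if link not in seen:
                seen.add(link)
                discovered.append(link)
                queue.append(link)
    return discovered

print(crawl("a.example", PAGES))
```

Starting from a.example, the sketch reaches d.example only because c.example links to it, which is why pages with no inbound links are hard for a crawler to discover.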
Information stored in the index
URL of source
Context of source
Last time updated
Quality of resource
Meta tags
Meta tags
Describe the content of a web page, hidden from users but discoverable by web crawlers
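As a rough illustration of how a crawler might read meta tags, the sketch below uses Python's standard html.parser module. The MetaTagParser class and the sample page are assumptions made for the example, not part of any particular search engine.

```python
from html.parser import HTMLParser

class MetaTagParser(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            if name is not None:
                self.meta[name] = attributes.get("content", "")

# Hypothetical page: the meta tags are not shown to the user,
# but they are present in the HTML that a crawler downloads.
SAMPLE_HTML = """
<html><head>
<title>Revision notes</title>
<meta name="description" content="Notes on search engines and networks">
<meta name="keywords" content="search, index, PageRank">
</head><body><p>Visible text</p></body></html>
"""

parser = MetaTagParser()
parser.feed(SAMPLE_HTML)
print(parser.meta)
```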
Process of searching the web
Web crawlers search the web and copy data to the index
The search engine looks through the index, not the live web, when a user submits a query
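The two steps can be sketched with a simple inverted index (a word-to-pages mapping). This is one common way to build an index, used here as an assumption; the CRAWLED data, build_index and search are hypothetical names for the example.

```python
# Hypothetical crawled data: URL -> page text copied by the crawler.
CRAWLED = {
    "a.example": "search engines build an index",
    "b.example": "web crawlers follow hyperlinks",
    "c.example": "the index maps words to pages",
}

def build_index(crawled):
    """Step 1: copy crawled data into an index of word -> set of URLs."""
    index = {}
    for url, text in crawled.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, query):
    """Step 2: answer a query from the index alone (no web access)."""
    words = query.lower().split()
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())  # keep pages containing every word
    return sorted(results)

INDEX = build_index(CRAWLED)
print(search(INDEX, "index"))
print(search(INDEX, "the index"))
```

Looking up words in a prebuilt index is what makes a query fast: the slow crawling work has already been done before the user searches.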
PageRank purpose
To list search results in order of usefulness and relevance
How does the PageRank algorithm work
Start by guessing a PageRank of 1 for every page, then repeat the formula many times until the values settle
PageRank algorithm
PR(A) = (1 - d) + d(PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
T1 to Tn are the pages that link to A; C(T) is the number of outbound links on page T
Meaning of d in PageRank algorithm
Damping factor - the probability that a random web user keeps following links rather than jumping to an unrelated page - usually 0.85
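A minimal sketch of the iteration, assuming a hypothetical three-page link graph (LINKS), that every page has at least one outbound link, and that no page links to the same target twice. It applies the formula PR(A) = (1 - d) + d(sum of PR(T)/C(T) over the pages T linking to A), starting every page at 1.

```python
def pagerank(links, d=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    ranks = {page: 1.0 for page in links}  # initial guess of 1 per page
    for _ in range(iterations):
        new_ranks = {}
        for page in links:
            # Sum PR(T) / C(T) over every page T that links to this page.
            incoming = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * incoming
        ranks = new_ranks
    return ranks

# Hypothetical link graph: A links to B and C, B links to C, C links to A.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

ranks = pagerank(LINKS)
for page, value in sorted(ranks.items()):
    print(page, round(value, 3))
```

In this graph C ends up with the highest value because it receives all of B's rank and half of A's, while B receives only half of A's.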
PageRank factors
Domain name
Frequency of search term
Age
Frequency of page updates
Magnitude of content
Keywords in <h1> tags
Components of a network model
Client and server
Two types of network model
Client-server and peer-to-peer
Features of a client-server network
Central server used to manage security
Files and processing on central server
Clients issue requests to server
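The three features can be sketched as an in-process simulation, with no real networking. The Server and Client classes, the user names and the file are all hypothetical.

```python
class Server:
    """Central server: stores the files and decides who may read them."""

    def __init__(self):
        self.files = {"report.txt": "Q1 figures"}
        # Security is managed in one place: user -> files they may read.
        self.permissions = {"alice": {"report.txt"}}

    def handle_request(self, user, filename):
        if filename not in self.permissions.get(user, set()):
            return "403 Forbidden"
        return self.files[filename]


class Client:
    """A client holds no files; it only issues requests to the server."""

    def __init__(self, user, server):
        self.user = user
        self.server = server

    def request(self, filename):
        return self.server.handle_request(self.user, filename)


server = Server()
print(Client("alice", server).request("report.txt"))
print(Client("bob", server).request("report.txt"))
```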
Features of a peer-to-peer network
No central server
All computers can see all files
If a computer is switched off, its data cannot be retrieved
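A small simulation of these features, assuming a hypothetical Peer class with no real networking. Each peer can both serve and fetch files, and switching a peer off makes its files unavailable.

```python
class Peer:
    """A computer that acts as both client and server."""

    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)
        self.online = True
        self.neighbours = []  # other peers this one can reach directly

    def serve(self, filename):
        """Server role: hand over a file, but only while switched on."""
        if self.online:
            return self.files.get(filename)
        return None

    def fetch(self, filename):
        """Client role: ask each neighbour in turn; there is no central server."""
        for peer in self.neighbours:
            data = peer.serve(filename)
            if data is not None:
                return data
        return None


pc1 = Peer("pc1", {})
pc2 = Peer("pc2", {"notes.txt": "revision notes"})
pc1.neighbours.append(pc2)

print(pc1.fetch("notes.txt"))  # pc2 is on, so the file is found
pc2.online = False
print(pc1.fetch("notes.txt"))  # pc2 is off, so the file cannot be retrieved
```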
Advantages of client-server
Suitable for organisations of all sizes
Access levels centrally controlled
Centralised and automated backup
Disadvantages of client-server
Expensive to set up and manage
Advantages of peer-to-peer
Cheap to set up and maintain
Each computer can act as client and server
Can be used for coverage of live events
Client processing
Data is processed by the client before it is sent to the server
Usually in the form of scripts
Advantages of client processing (JavaScript)
Immediately responds to user action
Quick execution as no communication with server
Reduces load on server
Improves security for the user, as data processed locally is not sent over the network where it could be intercepted
Disadvantages of client processing (JavaScript)
Not all browsers support all scripts
Dependent on performance of client’s machine
Different browsers process scripts differently
API
Application Programming Interface
API description
A set of rules and protocols that allows software applications to communicate with each other and exchange data, features and functionality
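To make "a set of rules" concrete, the sketch below invents a tiny JSON-based API. The handle_api_request function, the "add" action and the field names are all hypothetical; the point is only that both programs agree on the request and response formats in advance.

```python
import json

def handle_api_request(request_json):
    """Hypothetical API: accepts a JSON request, returns a JSON response."""
    request = json.loads(request_json)
    if request.get("action") == "add":
        result = request["a"] + request["b"]
        return json.dumps({"status": "ok", "result": result})
    return json.dumps({"status": "error", "message": "unknown action"})

# The calling application only needs to know the agreed format,
# not how the other program works internally.
response = handle_api_request(json.dumps({"action": "add", "a": 2, "b": 3}))
print(response)
```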
Server-side processing needs
Provide another layer of validation to user input
Display pages
Structure web applications
Interact with databases
Client-side checks can be easily circumvented
Sensitive data can be kept on the server
The processing method may be a company secret
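A minimal sketch of the extra validation layer, assuming a hypothetical sign-up form whose values arrive as strings. The validate_signup function, the field names and the rules are invented for the example. The server repeats the checks because a user can disable or bypass any script running in the browser.

```python
import re

def validate_signup(form):
    """Return a list of error messages; an empty list means the input is accepted."""
    errors = []
    username = form.get("username", "")
    age = form.get("age", "")
    # Hypothetical rule: 3 to 20 letters, digits or underscores.
    if not re.fullmatch(r"[A-Za-z0-9_]{3,20}", username):
        errors.append("username must be 3-20 letters, digits or underscores")
    # Hypothetical rule: a whole number from 13 to 120.
    if not (age.isascii() and age.isdigit() and 13 <= int(age) <= 120):
        errors.append("age must be a whole number from 13 to 120")
    return errors

print(validate_signup({"username": "sam_01", "age": "17"}))
print(validate_signup({"username": "x", "age": "abc"}))
```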