The overall increase in traffic on the World Wide Web is lengthening user-perceived response times from popular Web sites, especially in conjunction with special events. System platforms that do not replicate information content cannot provide the scalability needed to handle large traffic volumes and to match rapid and dramatic changes in the number of clients. The need to improve the performance of Web-based services has produced a variety of novel content delivery architectures. This paper focuses on Web system architectures that consist of multiple server nodes distributed on a local area, with one or more mechanisms to spread client requests among the nodes. After years of continual proposals of new system solutions, routing mechanisms, and policies (the first dating back to 1994, when the NCSA Web site had to face its first million requests per day), many problems concerning multiple-server architectures for Web sites have been solved. Other issues remain to be addressed, especially at the network application layer, but the main techniques and methodologies for building scalable Web content delivery architectures placed in a single location are now settled. This paper classifies and describes the main mechanisms to split the traffic load among the server nodes, discussing both the alternative architectures and the load-sharing policies. To this purpose, it focuses on architectures, internal routing mechanisms, and request dispatching algorithms for designing and implementing scalable Web-server systems under the control of one content provider. It also identifies some of the open research issues associated with the use of distributed systems for highly accessed Web sites.