
Web Content Mining from Hidden Web
A Methodical Web Mining Approach for Automated Information Extraction from Dynamic Web pages
LAP Lambert Academic Publishing
Published on 23 May 2011
Book
Paperback/Softback
84 pages
978-3-8443-8086-6 (ISBN)
Description
The World Wide Web is an enormous compilation of multi-variant data. For better knowledge management, it is important to retrieve accurate and complete data. The hidden Web, also known as the invisible Web or deep Web, has given rise to a new issue in Web mining research. Most documents in the hidden Web, including pages hidden behind search forms, specialized databases, and dynamically generated Web pages, are not accessible to general Web mining applications. In this work, a system is designed that can robustly access these hidden web pages using web structure mining techniques for better knowledge management. Modern web pages rely on dynamic content generation, and user forms are used to collect information from a user and store it in a database. The link structure contained in these forms cannot be accessed by conventional mining procedures. The accuracy of web page hierarchical structures can be improved by including these hidden web pages in the process of Web structure mining. The designed system is robust enough to process dynamic Web pages along with static ones.
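The point about link structure hidden in forms can be illustrated with a minimal Python sketch. It is not the system described in the book; the class name `LinkAndFormCollector`, the sample page, and the `example.com` URLs are all hypothetical and chosen only for illustration. A conventional extractor that follows only `<a href>` links sees one URL here, while also recording the form's `action` target and field names exposes the entry point to the dynamically generated pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndFormCollector(HTMLParser):
    """Collects anchor links and form submission targets from one HTML page.

    Hypothetical helper for illustration only, not the book's system.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.anchor_links = []   # what a conventional link extractor sees
        self.form_targets = []   # submission targets hidden behind forms
        self._current_form = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.anchor_links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "form":
            # A missing action means the form submits to the page itself.
            self._current_form = {
                "method": (attrs.get("method") or "get").lower(),
                "url": urljoin(self.base_url, attrs.get("action") or ""),
                "fields": [],
            }
        elif tag in ("input", "select", "textarea") and self._current_form is not None:
            # Only named controls are sent to the server on submission.
            if attrs.get("name"):
                self._current_form["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form" and self._current_form is not None:
            self.form_targets.append(self._current_form)
            self._current_form = None


# Assumed sample page: one static link and one search form.
SAMPLE_PAGE = """
<html><body>
  <a href="/about.html">About</a>
  <form action="/search" method="post">
    <input type="text" name="query">
    <select name="category"><option>books</option></select>
    <input type="submit" value="Go">
  </form>
</body></html>
"""

collector = LinkAndFormCollector("http://example.com/index.html")
collector.feed(SAMPLE_PAGE)
print(collector.anchor_links)
print(collector.form_targets)
```

A structure-mining step could then treat each entry in `form_targets` as an additional edge in the site graph; how the form fields should be filled in to reach the resulting pages is left open in this sketch.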
More details
Language
English
Place of publication
Germany
Product notice
Paperback (trade)
Unsewn / adhesive bound
Dimensions
Height: 220 mm
Width: 150 mm
Thickness: 6 mm
Weight
143 g
ISBN-13
978-3-8443-8086-6 (9783844380866)
Persons
M. Asif Naeem holds a PhD from the University of Auckland, New Zealand. He received his MS from Balochistan University of Information Technology and Management Science, Pakistan, in 2006. His research interests include online stream processing, data management and integration, business intelligence, and web mining.