
Research on the algorithm of web page slicing

Posted by: | Posted on: July 24, 2017

Recently I have been studying web page slicing algorithms. Many people probably do not know what a slicing algorithm is: it is the principle by which a web-oriented search engine divides a page into blocks and slices. As the work has deepened, I have run into a variety of problems, specifically in the following areas.

The granularity of web page slicing:

The goal of the page slicing algorithm is not to pinpoint exactly the content that is needed, but to identify the functional areas that partition the page: navigation, links, content, footers, and ad areas.
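As a rough illustration of slicing at this granularity, the sketch below treats each direct child of `<body>` as one candidate area and labels it by link density (the share of its text that sits inside links). The class and function names, the thresholds, and the link-density heuristic itself are assumptions made for illustration, not part of any particular search engine. It also assumes well-formed HTML with explicit closing tags, and it does not attempt to tell footers or ad areas apart.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not change the nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class BlockCollector(HTMLParser):
    """Measure text and link-text length for each direct child of <body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.depth = 0        # nesting depth below <body>
        self.link_depth = 0   # greater than 0 while inside an <a> element
        self.blocks = []      # one dict per top-level block

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
            return
        if not self.in_body or tag in VOID_TAGS:
            return
        if self.depth == 0:
            self.blocks.append(
                {"tag": tag, "id": dict(attrs).get("id"), "text": 0, "link_text": 0}
            )
        self.depth += 1
        if tag == "a":
            self.link_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
            return
        if not self.in_body or tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1

    def handle_data(self, data):
        n = len(data.strip())
        if self.depth > 0 and n:
            block = self.blocks[-1]
            block["text"] += n
            if self.link_depth:
                block["link_text"] += n


def classify(block, link_threshold=0.6, min_content_chars=80):
    """Label a block from its link density; both thresholds are guesses."""
    if block["text"] == 0:
        return "empty"
    density = block["link_text"] / block["text"]
    if density >= link_threshold:
        return "links"
    if block["text"] >= min_content_chars:
        return "content"
    return "other"


def slice_page(html):
    """Return (block id or tag name, label) for every top-level block."""
    collector = BlockCollector()
    collector.feed(html)
    return [(b["id"] or b["tag"], classify(b)) for b in collector.blocks]
```

Calling `slice_page` on a page whose body holds a link bar, a long article, and a short copyright line would label them "links", "content", and "other" respectively. A real slicer would need more signals (position, size, repetition across pages) to separate footers and ads.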

The object of page slicing:

By function, websites on the Internet fall into roughly two types: the directory type and the content type. As search engines have developed, website structure has gradually flattened, and there is data that bears this out. As display resolutions continue to improve, pages that combine content and directory are also becoming more common.

The page slicing algorithm should therefore target content pages and pages that blend content with a directory. Different kinds of pages need an identification algorithm; which criteria should it include?

Identifying the maximum extent of the content area:

Given the slicing granularity, the content area should be cut out as a separate part. Under the general conventions of web design, there are two common ways to hold the content area: (1) the containing type, such as a blog, and (2) the parallel type, such as a BBS thread.
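One way to picture the difference between the two layouts is to look for the deepest node that still holds most of the page's text. The sketch below is a minimal illustration under stated assumptions: the `Node` class, the `content_root` function, and the 90% coverage threshold are all hypothetical, and the tree is built by hand instead of being parsed from HTML. It also assumes that "containing" means a single container holds the article, while "parallel" means the content is spread over sibling blocks.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    own_text: int = 0                      # characters of text directly in this node
    children: list = field(default_factory=list)

    def total_text(self):
        """Text in this node plus all of its descendants."""
        return self.own_text + sum(c.total_text() for c in self.children)


def content_root(root, coverage=0.9):
    """Descend while one child alone still holds `coverage` of the page text."""
    threshold = coverage * root.total_text()
    if threshold == 0:
        return root
    node = root
    while True:
        heavy = [c for c in node.children if c.total_text() >= threshold]
        if not heavy:
            return node
        node = heavy[0]


# Containing type: one article node holds nearly all the text.
blog = Node("page", children=[
    Node("nav", 50),
    Node("wrapper", children=[Node("article", 2000)]),
    Node("footer", 30),
])

# Parallel type: the text is spread over sibling posts.
bbs = Node("page", children=[
    Node("nav", 50),
    Node("thread", children=[Node("post1", 700), Node("post2", 700), Node("post3", 700)]),
])

print(content_root(blog).name)  # article
print(content_root(bbs).name)   # thread
```

In the containing case the walk ends at the single article node; in the parallel case it stops at the common parent of the posts, because no single post covers enough of the text on its own.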

How to handle paginated content pages:

Most websites now paginate their content pages in order to improve the user experience and increase the number of page views. This part needs to be handled separately. I happened to come across "VIPS: a Vision-based Page Segmentation Algorithm", which shows in theory that the vision-based approach is feasible. However, there are many obstacles to implementing it. As one person put it: "I used floating instead of relative positioning with absolute positioning, and dynamically arranged in the client's JavaScript." In such pages the client-side objects are generated dynamically by scripts. Algorithms of this kind depend too heavily on specific implementations, and it is difficult to find a good general solution. Moreover, script-driven dynamic rendering on the client is steadily gaining popularity, which makes it hard for such algorithms to adapt to future trends.
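Before paginated pages can be handled separately, the pager itself has to be recognized. The sketch below is one simple heuristic, not a method taken from the VIPS paper: it takes the links of a block as `(text, href)` pairs and treats purely numeric link texts and "next"-style labels as pagination links. The function name, the label set, and the rule that a pager needs at least two numbered links are assumptions.

```python
import re

# Link labels treated as "go to the next page"; an illustrative, incomplete set.
NEXT_LABELS = {"next", "next page", ">", ">>", "»"}


def find_pager(links):
    """Return ({page_number: href}, next_href) for a list of (text, href) links."""
    pages = {}
    next_href = None
    for text, href in links:
        label = text.strip().lower()
        if re.fullmatch(r"[0-9]{1,4}", label):
            pages[int(label)] = href
        elif label in NEXT_LABELS:
            next_href = href
    # A lone numeric link is more likely a count or a date than a pager.
    if len(pages) < 2:
        pages = {}
    return pages, next_href
```

For example, a block with links labelled "2", "3", and "Next" would be recognized as a pager, and the collected hrefs could then be fetched and merged into a single content page before slicing.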

Take the simplest example: I have a toolbar in a style similar to Outlook's that is generated entirely by script. Visual analysis can only work at the visual level, that is, by analyzing a static picture of the page to obtain the correct partitioning. Partitioning alone is easy for an algorithm, but working out which content belongs to each partition split off by the bars is difficult. The only good way is to simulate mouse clicks and let the object at the click position return the response, which can be achieved in IE. Only in this way can we obtain the objects that belong to each partition.
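The idea of probing a partition with simulated clicks can be sketched without a browser by hit-testing bounding boxes. The code below is only a geometric stand-in for what a real browser does: the `Box` class, the function names, the `z` stacking value, and the 10-pixel probe step are all assumptions, and the boxes are supplied by hand instead of being read from a rendered page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    name: str
    left: int
    top: int
    right: int
    bottom: int
    z: int = 0  # stacking order; a higher value is drawn on top

    def contains(self, x, y):
        return self.left <= x < self.right and self.top <= y < self.bottom


def object_at(objects, x, y):
    """Return the topmost object under the point, like a click would hit."""
    hits = [o for o in objects if o.contains(x, y)]
    return max(hits, key=lambda o: o.z) if hits else None


def objects_in_partition(partition, objects, step=10):
    """Probe the partition on a grid and collect the names of the objects hit."""
    found = []
    for y in range(partition.top, partition.bottom, step):
        for x in range(partition.left, partition.right, step):
            hit = object_at(objects, x, y)
            if hit is not None and hit.name not in found:
                found.append(hit.name)
    return found
```

With a toolbar partition covering the left 100 pixels of the page, buttons placed inside that strip are collected and an article box to its right is not. Objects smaller than the probe step can be missed, so the step would have to be tuned to the smallest object of interest.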

Vision-based analysis relies on segmenting the screen. The gaps between areas are easy to widen or narrow, so that the spacing between them becomes clearer.




