Research on the algorithm of web page slicing
Posted by: admin | Posted on: July 24, 2017
I have recently been studying a page slicing algorithm. Many people probably do not know what a slicing algorithm is: it is a technique, aimed at web search engines, for dividing a page into blocks and slices. As the work has deepened, I have run into a variety of problems, specifically in the following areas.
The granularity problem of slicing:
The goal of the page slicing algorithm is not to pinpoint exactly the piece of content that is needed, but to identify the functional areas that partition the page: navigation, links, content, footers, and ad areas.
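As a rough illustration of this granularity, the sketch below labels a candidate block with one of the functional areas listed above. It is only a minimal heuristic under stated assumptions: the `Block` features, the link-density measure, and all thresholds are hypothetical and would need tuning on real pages.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A candidate page region described by a few hypothetical features."""
    text_chars: int              # characters of visible text in the block
    link_chars: int              # characters of that text that sit inside links
    top_fraction: float          # top edge position: 0.0 = page top, 1.0 = page bottom
    has_ad_marker: bool = False  # assumed to be detected elsewhere (e.g. a known ad container)


def link_density(block: Block) -> float:
    """Share of the block's text that is link text (0.0 for an empty block)."""
    if block.text_chars == 0:
        return 0.0
    return block.link_chars / block.text_chars


def classify(block: Block) -> str:
    """Assign one functional area; thresholds are illustrative assumptions."""
    if block.has_ad_marker:
        return "ad"
    if link_density(block) >= 0.6:
        # Link-heavy blocks: position decides between navigation, footer, link list.
        if block.top_fraction <= 0.2:
            return "navigation"
        if block.top_fraction >= 0.85:
            return "footer"
        return "links"
    if block.top_fraction >= 0.85 and block.text_chars < 300:
        return "footer"
    return "content"
```

For example, `classify(Block(5000, 250, 0.3))` falls through to "content" because only 5% of its text is link text.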
The objects of page slicing:
By function, websites on the Internet fall into roughly two types: directory type and content type. As search engines have developed, website structure has gradually become flatter (I have also checked this against data), and as display resolutions continue to improve, pages that combine content and directory are on the rise.
The object of the page slicing algorithm should therefore be pages that blend content and directory elements. Different kinds of pages need an identification algorithm to tell them apart; which criteria should it include?
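One possible criterion, offered only as a sketch and not as a settled answer to the question above, is the share of a page's text that is link text. The function name `page_type` and the `low` and `high` cut-offs are assumptions made for illustration.

```python
def page_type(text_chars: int, link_chars: int,
              low: float = 0.3, high: float = 0.7) -> str:
    """Guess the page type from the share of text that sits inside links.

    The cut-offs are illustrative assumptions, not measured values.
    """
    if text_chars <= 0:
        return "unknown"
    ratio = link_chars / text_chars
    if ratio >= high:
        return "directory"   # mostly links
    if ratio <= low:
        return "content"     # mostly running text
    return "mixed"           # blend of directory and content
```

A real identifier would probably combine several such signals; this one only separates the obvious cases.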
Identifying the maximum extent of the web content area:
From the granularity of slicing described above, it follows that the content area should be cut out as a separate part. According to the general conventions of web design, there are usually two ways to lay out content areas: 1. the containing type (such as a blog), and 2. the parallel type (such as a BBS post).
How to handle paged content pages:
Take the simplest case: I have a toolbar styled like the Outlook toolbar that is generated entirely by script. Visual analysis can only work on what is visible, so it has to analyze a static picture of the page to get the correct partitions. Partitioning by itself is easy for an algorithm, but assigning the page contents to the partitions they belong to is difficult. There is only one good way: simulate a mouse click and have the object at the click position return the response, which can be done in IE. Only in this way can we get the objects that belong to each partition.
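The click-simulation idea amounts to a hit test: ask which object lies under a given point, then repeat that across a partition. Browsers expose such a hit test as `document.elementFromPoint(x, y)`; the Python below only mimics the idea offline. The `Rect` and `Element` types, the `z` paint order, and the sampling `step` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rect:
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


@dataclass
class Element:
    name: str
    rect: Rect
    z: int = 0  # hypothetical paint order; higher is drawn on top


def element_from_point(elements: List[Element], x: int, y: int) -> Optional[Element]:
    """Return the topmost element under (x, y), or None if nothing is there."""
    hits = [e for e in elements if e.rect.contains(x, y)]
    return max(hits, key=lambda e: e.z) if hits else None


def elements_in_partition(elements: List[Element], partition: Rect,
                          step: int = 10) -> List[str]:
    """Probe the partition on a grid and collect the names of the elements hit."""
    found: List[str] = []
    for y in range(partition.top, partition.top + partition.height, step):
        for x in range(partition.left, partition.left + partition.width, step):
            hit = element_from_point(elements, x, y)
            if hit is not None and hit.name not in found:
                found.append(hit.name)
    return found


page = [
    Element("body", Rect(0, 0, 800, 600), z=0),
    Element("toolbar", Rect(0, 0, 800, 40), z=1),
    Element("button", Rect(10, 5, 60, 30), z=2),
    Element("article", Rect(0, 40, 800, 560), z=1),
]
toolbar_objects = elements_in_partition(page, Rect(0, 0, 800, 40))
```

A coarse `step` keeps the probe cheap but can miss objects smaller than the grid spacing, so the step would have to be chosen against the smallest object of interest.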
Visual analysis relies on segmenting the screen. It is easy to expand or shrink the spacing, so that the gaps between regions become clearer.
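One way to read this is that shrinking each block's bounding box slightly widens the gaps between blocks, so that separators show up. The sketch below assumes that reading and works on vertical extents only. The function name `split_into_bands` and the `shrink` and `min_gap` parameters are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (top, bottom) in pixels


def split_into_bands(boxes: List[Interval], shrink: int = 0,
                     min_gap: int = 1) -> List[Interval]:
    """Merge vertical extents into bands separated by at least `min_gap` pixels.

    Each box is first shrunk by `shrink` pixels at top and bottom, which
    widens the space between neighbouring boxes.
    """
    adjusted = sorted(
        (top + shrink, bottom - shrink)
        for top, bottom in boxes
        if bottom - top > 2 * shrink  # drop boxes that would vanish
    )
    bands: List[Interval] = []
    for top, bottom in adjusted:
        if bands and top - bands[-1][1] < min_gap:
            # Too close to the previous band: treat as the same region.
            bands[-1] = (bands[-1][0], max(bands[-1][1], bottom))
        else:
            bands.append((top, bottom))
    return bands


boxes = [(0, 40), (38, 100), (104, 300), (320, 360)]
coarse = split_into_bands(boxes, shrink=0, min_gap=10)
fine = split_into_bands(boxes, shrink=4, min_gap=10)
```

With these sample boxes, the unshrunk pass yields two bands, while shrinking by 4 pixels exposes one more gap and yields three.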