Web search engine
The search technology provides local search results in more than 1,400 cities. Yandex Search also features “parallel” search, which presents results from both the main web index and specialized information resources, including news, shopping, blogs, images and videos, on a single page.
Yandex Search is responsive to real-time queries, recognizing when a query requires the most current information, such as breaking news or the most recent post on Twitter on a particular topic. It also offers additional features: Wizard Answer, which provides additional information (for example, sports results); a spell checker; autocomplete, which suggests queries as you type; an antivirus that detects malware on webpages; and so on.
In May 2010, Yandex launched Yandex.com, a platform for beta testing and improving non-Russian language search.
The search product can be accessed from personal computers, mobile phones, tablets and other digital devices. In addition to web search, Yandex provides a wide range of specialized search services.
In 2009, Yandex launched MatrixNet, a new method of machine learning that significantly improves the relevance of search results. It allows the Yandex search engine to take a very large number of factors into account when it decides how relevant a search result is.
Another technology, Spectrum, was launched in 2010. It allows inferring implicit queries and returning matching search results. The system automatically analyses users' searches and identifies objects like personal names, films or cars. Proportions of the search results responding to different user intents are based on the user demand for these results.
Since its first release on July 21, 2017, the Brave web browser has featured Yandex as one of its default search engines.
The search engine consists of three main components:
In general, Yandex indexes the following file types: html, pdf, rtf, doc, xls, ppt, docx, odt, odp, ods, odg, xlsx, pptx.
The search engine is also able to index text inside Shockwave Flash objects (provided the text is not part of an image) if these objects are served as a separate page with the MIME type application/x-shockwave-flash, as well as files with the extension .swf.
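The file-type rules above can be sketched as a small check. This is only an illustration: the function name `is_indexable` is hypothetical, and treating either the .swf extension or the Flash MIME type as sufficient is an assumption about how the two conditions combine.

```python
from pathlib import PurePosixPath

# Extensions the text lists as indexable document types.
INDEXABLE_EXTENSIONS = {
    "html", "pdf", "rtf", "doc", "xls", "ppt", "docx",
    "odt", "odp", "ods", "odg", "xlsx", "pptx",
}
FLASH_MIME = "application/x-shockwave-flash"


def is_indexable(path: str, mime_type: str = "") -> bool:
    """Return True if a URL path looks indexable under the rules in the text.

    Hypothetical helper: Flash content is accepted when either the
    extension is .swf or the MIME type is the Flash one (an assumption).
    """
    ext = PurePosixPath(path).suffix.lower().lstrip(".")
    if ext in INDEXABLE_EXTENSIONS:
        return True
    return ext == "swf" or mime_type.strip().lower() == FLASH_MIME
```

For example, `is_indexable("/docs/report.PDF")` is true because the extension comparison is case-insensitive, while `is_indexable("/files/archive.zip")` is false.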
Yandex has two crawling robots: the “main” and the “fast”. The first is responsible for the whole Internet; the second indexes sites whose information changes and is updated frequently (news sites and news agencies). In 2010, the “fast” robot received a new technology called “Orange”, developed jointly by the California and Moscow divisions of Yandex.
Since 2009, Yandex has supported Sitemaps technology.
In the server logs, Yandex robots are represented as follows:
Mozilla/5.0 (compatible; YandexBot/3.0) - the main indexing robot.
Mozilla/5.0 (compatible; YandexBot/3.0; MirrorDetector) - a robot that detects site mirrors. If several sites have the same content, only one is shown in the search results.
Mozilla/5.0 (compatible; YandexImages/3.0) - the Yandex.Images indexer.
Mozilla/5.0 (compatible; YandexVideo/3.0) - the Yandex.Video indexer.
Mozilla/5.0 (compatible; YandexMedia/3.0) - a robot that indexes multimedia data.
Mozilla/5.0 (compatible; YandexBlogs/0.99; robot) - a search robot that indexes post comments.
Mozilla/5.0 (compatible; YandexAddurl/2.0) - a search robot that indexes pages submitted through the "Add URL" form.
Mozilla/5.0 (compatible; YandexDirect/2.0; Dyatel) - a robot that checks Yandex.Direct.
Mozilla/5.0 (compatible; YandexMetrika/2.0) - the Yandex.Metrica robot.
Mozilla/5.0 (compatible; YandexCatalog/3.0; Dyatel) - a robot that checks Yandex.Catalog.
Mozilla/5.0 (compatible; YandexNews/3.0) - the Yandex.News indexer.
Mozilla/5.0 (compatible; YandexAntivirus/2.0) - the Yandex anti-virus robot.
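As a rough illustration of how a site owner might recognize these robots in server logs, the sketch below extracts the robot token from a User-Agent string and maps it to the role listed above. The function name `identify_yandex_robot` and the role labels are assumptions made for this example. A User-Agent header can be set to anything by the client, so a match is a hint rather than proof of origin.

```python
import re

# Robot tokens and roles, taken from the list above.
ROBOT_ROLES = {
    "YandexBot": "main indexing robot",
    "YandexImages": "Yandex.Images indexer",
    "YandexVideo": "Yandex.Video indexer",
    "YandexMedia": "multimedia indexer",
    "YandexBlogs": "post comment indexer",
    "YandexAddurl": "Add URL form robot",
    "YandexDirect": "Yandex.Direct checker",
    "YandexMetrika": "Yandex.Metrica robot",
    "YandexCatalog": "Yandex.Catalog checker",
    "YandexNews": "Yandex.News indexer",
    "YandexAntivirus": "anti-virus robot",
}

# Matches tokens such as "YandexBot/3.0" or "YandexBlogs/0.99".
_TOKEN = re.compile(r"(Yandex[A-Za-z]+)/(\d+(?:\.\d+)*)")


def identify_yandex_robot(user_agent: str):
    """Return (token, role) for a listed Yandex robot, or None otherwise."""
    match = _TOKEN.search(user_agent)
    if match is None:
        return None
    name = match.group(1)
    role = ROBOT_ROLES.get(name)
    if role is None:
        return None
    # The mirror detector shares the YandexBot token and adds a marker.
    if name == "YandexBot" and "MirrorDetector" in user_agent:
        role = "site mirror detector"
    return name, role
```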
The following operators are used to refine a query:
"" - exact quote
| - placed between words when any one of them should be found
* - placed between words where a word is missing
site: - search on a specific site
date: - search for documents by date, for example, date:2007
+ - placed before a word that must appear in the document
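The operators listed above can be combined into a query string. The helpers below are a minimal sketch with hypothetical names; they only build text and assume that an operator and its value are written without a space between them (for example, site:example.com).

```python
def exact(phrase: str) -> str:
    """Wrap a phrase in quotes for an exact-quote search."""
    return f'"{phrase}"'


def any_of(*words: str) -> str:
    """Join alternatives with | so that any one of them may match."""
    return " | ".join(words)


def required(word: str) -> str:
    """Mark a word that must appear in the document."""
    return f"+{word}"


def on_site(query: str, host: str) -> str:
    """Restrict a query to one site."""
    return f"{query} site:{host}"


def on_date(query: str, date: str) -> str:
    """Restrict a query to documents with the given date."""
    return f"{query} date:{date}"
```

For instance, `on_site(required("sitemap"), "example.com")` yields `+sitemap site:example.com`, where example.com is a placeholder host.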
Along with the original “exact form” of the query, Yandex automatically searches for its variations and reformulations.
Yandex search takes the morphology of the Russian language into account, so the search is performed for all forms of a word regardless of the form used in the query. If morphological analysis is undesirable, you can put an exclamation mark (!) before the word; the search will then return only that specific form of the word. In addition, the search largely ignores so-called stop-words, that is, prepositions, punctuation, pronouns, etc., because they are so widespread.
As a rule, abbreviations are expanded and spelling is corrected automatically. The engine also searches for synonyms (mobile - cellular). How the original query is expanded depends on the context. Expansion does not occur for sets of highly specialized terms, for proper names of companies (for example, OJSC “Hippo” is not expanded to OJSC “Hippopotamus”), when the word “price” is added, or for exact quotes (queries enclosed in typewriter quotes).
Search results for each user are formed individually based on their location, language of a query, interests and preferences based on the results of previous and current search sessions. However, the key factor in ranking search results is their relevance to the search query. Relevance is determined based on a ranking formula, which is constantly updated based on machine learning algorithms.
The search is performed in Russian, English, French, German, Ukrainian, Belarusian, Tatar and Kazakh.
Search results can be sorted by relevance and by date (buttons below the search results).
The search results page consists of 10 links with short annotations - “snippets”. A snippet includes a text comment, a link, an address, popular sections of the site, pages on social networks, etc. In 2014, Yandex introduced a new interface called “Islands” as an alternative to snippets.
Yandex implements a “parallel search” mechanism: together with the web search, a search is performed on Yandex services such as Catalog, News, Market, Encyclopedias, Images, etc. As a result, in response to a user’s query the system shows not only textual information but also links to video files, pictures, dictionary entries, etc.
Another distinctive feature of the search engine is its "intent search" technology, that is, search aimed at solving the user's task. Elements of intent search include dialog prompts for ambiguous queries, automatic text translation, information about the characteristics of a requested car, etc. For example, for the query “Boris Grebenshchikov - Golden City” the system shows a form for listening to the music online through the Yandex Music service, and for the query "st. Koroleva 12" it shows a fragment of the map with the object marked on it.
In 2013, Yandex was considered by some to be the safest search engine at the time and the third most secure among all web resources. By 2016, Yandex had slipped down to third with Google being first.
Yandex began checking web pages and warning users in 2009: since then, a dangerous site on the search results page has been accompanied by the note “This site may threaten the security of your computer”. Two technologies are used together to detect threats. The first was purchased from the American antivirus vendor Sophos and is based on a signature approach: when a web page is accessed, the antivirus system consults a database of already known viruses and malware. This approach is fast but practically powerless against new viruses that have not yet entered the database. For this reason, alongside the signature approach Yandex also uses its own antivirus complex, based on behavioral analysis. When accessing a site, the Yandex program checks whether the site requested additional files from the browser, redirected it to an extraneous resource, etc.
If the site is found to perform certain actions without the user's permission (launching cascading style sheets, JavaScript modules or complete programs), it is placed on the “black list” and in the database of virus signatures. Information about the infection appears in the search results, and the site owner receives a notification through the Yandex.Webmaster service. After the first check, Yandex performs a second one, and if the infection is confirmed, checks become more frequent until the threat is eliminated. The share of infected sites in the Yandex database does not exceed 1%.
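The two-stage check described above can be sketched as follows. This is an illustration only, not Yandex's or Sophos's implementation: the class and field names are hypothetical, and using a SHA-256 hash of the page content as a "signature" is a simplifying assumption.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PageObservation:
    """What a hypothetical checker saw when it loaded a page."""
    url: str
    content: bytes
    requested_extra_files: bool = False
    redirected_elsewhere: bool = False


@dataclass
class Scanner:
    known_signatures: set = field(default_factory=set)
    blacklist: set = field(default_factory=set)

    def check(self, page: PageObservation) -> str:
        # Stage 1: signature lookup. Fast, but only finds known threats.
        digest = hashlib.sha256(page.content).hexdigest()
        if digest in self.known_signatures:
            self.blacklist.add(page.url)
            return "signature"
        # Stage 2: behavioral check for actions taken without permission.
        if page.requested_extra_files or page.redirected_elsewhere:
            self.blacklist.add(page.url)
            # A newly detected threat also enters the signature database.
            self.known_signatures.add(digest)
            return "behavior"
        return "clean"
```

In this sketch, a page caught by the behavioral stage adds its signature to the database, so an identical page on another site is later caught by the faster signature stage.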
In 2013, Yandex checked 23 million web pages every day (detecting 4,300 dangerous sites) and showed users 8 million warnings.[23] Approximately one billion sites were checked monthly.
For a long time, the key ranking factor for Yandex was the number of third-party links to a particular site. Each page on the Internet was assigned a citation index, similar to the index for authors of scientific articles: the more links, the better. A similar mechanism was implemented in Yandex and in Google’s PageRank. To prevent cheating, Yandex uses multivariate analysis, in which only 70 of the 800 factors are affected by the number of third-party links. Today, the content of the site, the presence or absence of keywords in it, the readability of the text, the domain name, its history and the presence of multimedia content play a much greater role.
On December 5, 2013, Yandex announced that in the future it would stop taking the link factor into account entirely.
As the user types a query in the search bar, the search engine offers hints in the form of a drop-down list. Hints appear even before the search results do and allow you to refine the query, correct the keyboard layout or a typo, or go directly to the site you are looking for. Hints are generated for each user individually, drawing in part on the history of their search queries (the My Finds service). In 2012, the so-called “Smart Search Hints” appeared, which instantly give information about the main constants (equator length, speed of light, and so on) and traffic jams, and which have a built-in calculator. In addition, Hints integrate a translator (the query “love in French” instantly returns amour, affection), the schedule and results of football matches, exchange rates, weather forecasts and more. You can find out the exact time by asking "what time is it." In 2011, Hints in Yandex search were fully localized for 83 regions of Russia.
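The basic idea of a drop-down hint list can be illustrated with a small prefix lookup over previously seen queries, ranked by how often each was entered. This is a deliberately simple sketch with hypothetical names; it does not reflect how Yandex actually generates hints, which the text says also depends on the user's history and region.

```python
from collections import Counter


class HintIndex:
    """Toy hint source: suggests past queries that start with the typed text."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, query: str) -> None:
        # Normalize so that "Weather" and "weather " count as one query.
        self._counts[query.strip().lower()] += 1

    def suggest(self, prefix: str, limit: int = 5) -> list:
        prefix = prefix.strip().lower()
        matches = [(q, n) for q, n in self._counts.items() if q.startswith(prefix)]
        # Most frequent first; ties are broken alphabetically.
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [q for q, _ in matches[:limit]]
```

A production system would more likely use a trie or a precomputed index rather than scanning every stored query, but the ranking idea is the same.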
In addition to the main search, Hints are built into Yandex.Dictionaries, Yandex.Market, Yandex.Maps and other Yandex services.
The hint function grew out of the development of intent search technology. It first appeared in Yandex.Bar in August 2007, and in October 2008 it was introduced on the main page of the search engine. Hints are available in both the desktop and mobile versions of the site, and Yandex shows its users more than a billion search hints per day.
According to media expert Mikhail Gurevich, Yandex is a “national treasure”, a “strategic product”.
This fact was also recognized in the State Duma of the Russian Federation, where in May 2012 a bill appeared under which Yandex and VKontakte would be recognized as strategic enterprises in their role as nationwide disseminators of information. In 2009, President of Russia Dmitry Medvedev initiated the purchase of a “golden share” of Yandex by Sberbank in order to prevent a company of nationwide importance from falling into foreign hands.
In 2012, Yandex overtook Channel One in terms of daily audience, which made Yandex a leader in the domestic media market. In 2013, Yandex confirmed this status by overtaking Channel One in terms of revenue.
In 2008, Yandex was the ninth search engine in the world, in 2009 the seventh, and in 2013 the fourth.
One factor behind this position is the presence in Russia of a sufficient number of mathematically savvy specialists with a scientific instinct.
By 2002, the word Yandex had become so common that when Arkady Volozh's company demanded the return of the yandex.com domain, which had been bought by third parties, the defendant stated that the word "Yandex" was already synonymous with search and had become a household word in Russia.
Since late 2012, the Yandex search engine has had more users than Google on the Google Chrome browser in Russia.
2008
2007
2006
In early December, a “Saved copy” item appeared next to each link in the search results; clicking it takes the user to a full copy of the page in a special archive database (“Yandex cache”).
2005
The ranking algorithm has been improved to increase search accuracy.
Search capabilities have been expanded with the help of “Yandex.Dictionaries” and “Yandex.Lingvo”. The search engine has learned to understand queries like “What is [something] in Spanish” and automatically translate them.
It became possible to limit search results by region.
2004
Yandex began indexing documents in .swf (Flash), .xls and .ppt formats.
At the end of the year, the study “Some Aspects of Full-Text Search and Ranking in Yandex” was published (authors Ilya Segalovich and Mikhail Maslov), which revealed certain details of ranking in the search engine.
2003
2002
2001
2000
In December 2000, the volume of indexed information reached 355.22 GB.
1990
The name stands for “yet another indexer” (or is read as “I” ("ya" in Russian) plus “index”). According to the interpretation of Artemy Lebedev, the name of the search engine is consonant with “Yandeks”, where yang denotes the masculine principle.
The yandex.ru search engine was announced by CompTek on September 23, 1997 at the Softool exhibition, although some developments in the field of search (Bible indexing, searching for documents on CD-ROM, site search) were carried out by the company even earlier.
The first index contained information on 5 thousand servers and occupied 4.5 GB.
Also in 1997, Yandex search began to be used in the Russian version of Internet Explorer 4.0. It became possible to query in natural language.
In 1998, the function “find similar documents” appeared for each search result.
As of 1998, “Yandex.Search” ran on three machines running FreeBSD with Apache: one machine crawled the Internet and indexed documents, one served as the search engine, and one duplicated the search engine.
In 1999, search within categories appeared, combining a search engine and a catalog. The search engine was updated to a new version.