How IBM Watson works, in one picture

The IBM Watson system is named after Thomas J. Watson Sr., the longtime chief executive who built the modern IBM. It became famous in 2011, when it defeated two human champions on the American TV quiz show Jeopardy!

Jeopardy! uses a distinctive answer-and-question format. The clues cover a wide range of topics, including history, literature, art, popular culture, politics, science, technology, sports, geography, and word games. Unlike an ordinary quiz show, Jeopardy! presents each clue in the form of an answer, and contestants must give a short, correct response phrased as a question. For example, given the clue "This city is the capital of France," a correct response would be "What is Paris?" Besides broad knowledge, contestants need to understand hidden meanings, irony, and riddles. Computers are not good at this kind of complex thinking.

It is worth noting that Watson was offline during the game. Unlike AlphaGo in its match against Lee Sedol, Watson could draw only on the knowledge base and extension packages stored on its hard disks, so, like the human contestants, it had to rely on its own store of knowledge. Under these conditions, Watson tied with its opponents in the first two rounds. In the final episode, Watson defeated top prize winner Brad Rutter and winning-streak holder Ken Jennings to take first place.

Watson is essentially a computer question-answering system built by IBM. IBM describes Watson as an application of advanced natural language processing, information retrieval, knowledge representation, automated reasoning, and machine learning to open-domain question answering, built on its DeepQA technology for hypothesis generation and large-scale evidence gathering, analysis, and scoring. Although Watson uses some techniques associated with deep learning, such as transfer learning, to solve certain subproblems, it is not, unlike AlphaGo, an AI built entirely on deep learning. Its central idea is closer to Marvin Minsky's Society of Mind, in which many simple, specialized components cooperate to produce intelligent behavior.

At the recently concluded IJCAI 2016 conference, Professor James Hendler of Rensselaer Polytechnic Institute was invited to give a detailed presentation on how IBM Watson works.

Think of Watson as a ring-shaped office corridor in which each office houses a team doing one specialized job. Let us walk through the whole process from the beginning.

QUESTION IN

Question Analysis

What does this question mean?

What are we looking for?

Is there any other valid information?

Do any words in the question refer to other words in the question?

At this stage, DeepQA tries to understand the question, work out exactly what it is asking, and perform some preliminary analysis to decide which methods to use to answer it.

Primary Search

Can we find any documents in the database that may be related to this question?

How many documents were found?

Where do these documents come from?

Search Result Processing and Candidate Answer Generation

Is there a possible answer to this question in these documents?

How many candidate answers are there?

When a question comes in, a file is opened for it. As the file travels from office to office, more information is added to it.

Context-Independent Answer Scoring

Is this candidate likely to be the correct answer?

Is this candidate the right type of answer?

Soft Filtering

Are any candidates obviously wrong?

If so, can we keep them from taking up too much time in the rest of the process?

Each candidate answer is given a score reflecting how well the evidence supports it.
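
To make the soft-filtering step concrete, here is a minimal Python sketch. It is an illustration only, not IBM's implementation: the Candidate class, the quick_score field, the threshold of 0.2, and the keep_at_least parameter are all assumptions made for this example. The idea it shows is that a cheap, context-independent score decides which candidates receive the expensive evidence analysis, while the rest are set aside rather than deleted.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    quick_score: float  # hypothetical cheap, context-independent score in [0, 1]


def soft_filter(candidates, threshold=0.2, keep_at_least=2):
    """Split candidates into (passed, bypassed).

    Candidates scoring at or above the threshold go on to deep evidence
    scoring. The rest bypass that expensive step; they are not discarded,
    which is why the filter is called "soft".
    """
    ranked = sorted(candidates, key=lambda c: c.quick_score, reverse=True)
    n_pass = sum(1 for c in ranked if c.quick_score >= threshold)
    # Always keep a few of the best candidates, even if all scores are low.
    n_keep = max(n_pass, min(keep_at_least, len(ranked)))
    return ranked[:n_keep], ranked[n_keep:]


if __name__ == "__main__":
    pool = [Candidate("Chicago", 0.9), Candidate("1945", 0.05), Candidate("Toronto", 0.4)]
    passed, bypassed = soft_filter(pool)
    print("deep scoring:", [c.text for c in passed])
    print("bypassed:", [c.text for c in bypassed])
```

In a real system the threshold would be tuned on training questions instead of being fixed by hand.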

Supporting Evidence Retrieval

Can we find any information in the database that supports a candidate answer being correct?

How much information is there for each candidate?

Search Result Processing and Context-Dependent Scoring

How well does each candidate work as an answer to the question?

Now that there is more information, what score should each candidate get?

Final Merging and Ranking

Is there any additional information that could change the scores?

What is the total score for each choice?

Which is the highest score?

What is the second highest score?

DeepQA's designers also observed that different surface forms of the same answer are often supported by different evidence and receive very different, but potentially complementary, scores. This led to an approach that merges the scores of equivalent answers before ranking and confidence estimation.
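
The merging idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not DeepQA's actual method: the ALIASES table, the feature names, and the rule of keeping the maximum score per feature are all hypothetical choices made for the example.

```python
from collections import defaultdict

# Hypothetical alias table mapping surface forms to one canonical answer.
ALIASES = {
    "abe lincoln": "abraham lincoln",
    "honest abe": "abraham lincoln",
}


def canonical(text):
    """Normalize case and whitespace, then resolve known aliases."""
    key = " ".join(text.lower().split())
    return ALIASES.get(key, key)


def merge_candidates(scored):
    """Merge (surface_form, {feature: score}) pairs by canonical answer.

    For each feature, the best score seen across all surface forms is kept,
    so evidence found for one form benefits the merged answer.
    """
    merged = defaultdict(dict)
    for surface, features in scored:
        slot = merged[canonical(surface)]
        for name, value in features.items():
            slot[name] = max(value, slot.get(name, value))
    return dict(merged)


if __name__ == "__main__":
    scored = [
        ("Abraham Lincoln", {"type": 0.9, "passage": 0.2}),
        ("Honest Abe", {"passage": 0.7, "popularity": 0.5}),
        ("Ulysses Grant", {"type": 0.8}),
    ]
    for answer, features in merge_candidates(scored).items():
        print(answer, features)
```

Taking the maximum is only one possible combination rule; summing or averaging per feature are equally plausible choices.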

ANSWER OUT

The highest-scoring answer is returned, and Watson tries to judge how well (or badly) it is doing, that is, how confident it should be in that answer.
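
As a rough sketch of this last step, the Python function below picks the top answer and answers only when its confidence clears a threshold. The function name, the example confidences, and the threshold of 0.5 are assumptions for illustration; the article does not say how Watson's actual decision rule is set.

```python
def choose_answer(confidences, threshold=0.5):
    """Return (answer, confidence) for the best candidate.

    The answer is None when even the best candidate falls below the
    threshold, which models declining to answer.
    """
    if not confidences:
        return None, 0.0
    best = max(confidences, key=confidences.get)
    confidence = confidences[best]
    return (best if confidence >= threshold else None), confidence


if __name__ == "__main__":
    print(choose_answer({"Paris": 0.92, "Lyon": 0.05}))
    print(choose_answer({"Paris": 0.31, "Lyon": 0.30}))
```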

The above is the basic principle of how Watson works. Later in his talk, Professor James Hendler explained that Watson is based on "associated knowledge" and that the process is implemented as follows.

Parse the question text

Generate a query request

Search for possible answers (hypotheses)

Hypotheses 1 to m

Analyze hypothesis text

Generate a query request

Search evidence

Evidence 1.1 to 1.m

Parse evidence text

Score the evidence

Combine evidence and score hypothesis 1
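
The steps above can be sketched as a toy Python pipeline. Everything here is a stand-in chosen for illustration: the three-sentence CORPUS, the stopword list, the use of capitalized words as hypotheses, and the term-overlap evidence score are assumptions, and none of them reflects how Watson actually implements these steps.

```python
import re

# Hypothetical in-memory "knowledge base" standing in for Watson's stored documents.
CORPUS = [
    "Paris is the capital of France.",
    "Lyon is a large city in France.",
    "The Eiffel Tower stands in Paris, the French capital.",
]
STOPWORDS = {"the", "is", "of", "a", "in", "what", "which", "this"}


def terms(text):
    """Parse text into a set of lowercase content words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def generate_hypotheses(question):
    """Search for possible answers: capitalized words in related passages."""
    q_terms = terms(question)
    hypotheses = set()
    for passage in CORPUS:
        if q_terms & terms(passage):
            for word in re.findall(r"[A-Z][a-z]+", passage):
                if word.lower() not in q_terms and word.lower() not in STOPWORDS:
                    hypotheses.add(word)
    return hypotheses


def search_evidence(hypothesis):
    """Search evidence: every passage that mentions the hypothesis."""
    return [p for p in CORPUS if hypothesis.lower() in terms(p)]


def score_evidence(q_terms, passage):
    """Evidence score: fraction of question terms found in the passage."""
    return len(q_terms & terms(passage)) / len(q_terms) if q_terms else 0.0


def answer(question):
    """Combine the evidence scores for each hypothesis and rank the results."""
    q_terms = terms(question)
    scores = {
        hyp: sum(score_evidence(q_terms, p) for p in search_evidence(hyp))
        for hyp in generate_hypotheses(question)
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    for hypothesis, score in answer("What is the capital of France?"):
        print(hypothesis, score)
```

Even in this toy version, the correct hypothesis wins because it is supported by more, and better matching, evidence than its rivals, not because any single rule identifies it.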

By reading ambiguous questions and searching its broad knowledge base for answers, Watson demonstrated an ability to understand natural language, which is one of the most difficult problems computers have to overcome. This seems to indicate that computers will soon be able to “understand” complex information and converse with humans, and may even go on to surpass people in most areas once thought to be uniquely human.

Watson integrates hundreds of algorithms that analyze the evidence for each candidate hypothesis along different dimensions, such as type, time, space, popularity, passage support, source reliability, and semantic relevance. Each analysis produces features or scores that reflect how well the evidence supports the candidate answer in that dimension. Removing any single scorer from the final system has no significant effect on a test set of thousands of questions; in fact, no single scorer has an impact of more than 1%. In combination, however, Watson reached an average accuracy of 92% when answering 40% to 70% of the questions.
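
One simple way to picture how many weak scorers add up to a single confidence value is a weighted sum passed through a logistic function, sketched below in Python. This is an assumption made for illustration: the article does not say which combination model Watson uses, and the WEIGHTS, BIAS, and feature values here are invented. In a real system such weights would be learned from training questions.

```python
import math

# Hypothetical weights, one per scoring dimension named in the text.
WEIGHTS = {
    "type": 2.0,
    "time": 0.5,
    "space": 0.5,
    "popularity": 0.3,
    "passage_support": 1.5,
    "source_reliability": 0.7,
    "semantic_relevance": 1.0,
}
BIAS = -3.0  # hypothetical offset so that weak evidence yields low confidence


def confidence(features):
    """Combine per-dimension scores (each in [0, 1]) into one confidence."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    strong = {name: 0.9 for name in WEIGHTS}
    weak = {name: 0.2 for name in WEIGHTS}
    print("strong evidence:", round(confidence(strong), 3))
    print("weak evidence:", round(confidence(weak), 3))
```

With this kind of combination, no single dimension decides the outcome; each one nudges the total a little.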

Prof. James Hendler also mentioned in his talk that several important technical barriers have been broken recently, and that deep learning (machine learning), cognitive computing, and semantic networks (knowledge graphs) have once again drawn intense attention to AI. He added that the Watson system has already begun to enter the medical industry. AlphaGo, mentioned earlier, is also expected to make its mark in this industry. Will the two go head to head? We will have to wait and see.

Via James Hendler

PS: This article was compiled by Lei Feng Network (search for the “Lei Feng Network” public account); reproduction without permission is prohibited.
