Chapter 758 Manifold Learning
The question Yao Mengna raised was not difficult for Chang Haonan to understand.
It was just difficult to solve.
To be honest, it touched on a whole series of fields: text mining, data visualization, information retrieval, data mining, machine learning and even artificial intelligence.
Fully automated production as Yao Mengna envisioned it would amount to Industry 4.0.
In 1999, that was obviously unrealistic.
But the fact that the whole thing could not be realized did not mean that no part of it could serve as a breakthrough point.
For example, data mining and information retrieval were very hot research directions around the turn of the millennium.
Their core purpose was to extract valuable knowledge from massive databases and large volumes of complex information, and thereby improve how well that information is used.
In fact, before Chang Haonan was reborn, the field of aircraft design and manufacturing had already begun to apply this technology, and he himself had been exposed to a good deal of it.
But at that time, as an ordinary technician with an engineering background, he did not have much theoretical foundation.
That left him with a head full of terms and no idea which one was the key to solving the problem.
In effect, he himself was facing the dilemma of being unable to extract valuable information from a mass of complicated information.
As for the system, it first needed him to construct a complete and feasible idea.
Chang Haonan pulled over a sheet of paper and wrote a single word in the middle of it.
“Information…”
In an idealized model, it is best if one piece of data accurately and uniquely describes one meaning.
That is, one-dimensional data.
The word problems done in primary and middle school are generally like this.
In more complex situations, fully describing one meaning often requires a set of data.
At the same time, that set of data often describes more than just that one meaning.
In fact, most of the problems faced in real life are of this kind.
To describe mathematically how a set of data can correspond to multiple meanings, the set has to be expanded along different dimensions.
That is what happens when you go from mathematical theory to reality.
Conversely, in most cases, the information collected in reality is itself high-dimensional data that has already been expanded.
And if you want computers to process such high-dimensional data...
Chang Haonan thought for a long time, then wrote three basic conditions on the paper:
1. Compress the original high-dimensional data to reduce its dimensionality, thereby saving storage space and lowering computational complexity.
2. Eliminate, or at least reduce, the noise hidden in the original high-dimensional data.
3. Extract high-quality data features to improve the results of subsequent data representation and classification tasks.
He ran through the three items in his mind, then tried to get the system to give a result.
No response.
Obviously, this did not count as a “complete and feasible” idea.
Before he knew it, Chang Haonan had been sitting at his desk until it was almost time for lunch.
He still had not come up with a good idea.
Only when his stomach growled did he snap out of his thoughts.
He was indeed a little hungry.
Seeing the single noun and three sentences on the paper, Yao Mengna knew that Chang Haonan probably had no ideas yet, so she simply stood up and said:
“Shall we eat first?”
“Sure.”
Chang Haonan was not the kind of person who obsessed over things.
What's more, with something like mathematics, you cannot figure anything out just by thinking hard about it.
Without inspiration, nothing else is of any use.
It was better to relax first and think about something else.
Fifteen minutes later, the three of them (together with Zhu Yadan) were already sitting around a round table on the second floor of the cafeteria.
This was a small à la carte restaurant. It was a bit more expensive than the large canteen below, and it was one floor up, so not many people ate here.
By contrast, the small supermarket next to it had plenty of people coming and going.
Chang Haonan had a steaming bowl of mutton soup noodles in front of him, but he was in no hurry to pick up his chopsticks. Instead, he was watching the people going up and down the stairs not far away.
In the 1990s, instant noodles were still a very popular ready-to-eat food. When Chang Haonan was an undergraduate, everyone was generally poor, and not many people had the spare money to afford them.
But by 1999, it was no longer unusual for college students to keep a few bags or even a whole box in their dormitories.
“You said…”
Chang Haonan suddenly said:
“How do companies that produce instant noodles ensure that they don't miss out or overfill seasoning packets?”
Yao Mengna, who was eating with her head down, was stunned for a moment, and immediately realized that Chang Haonan was still thinking about the question she just raised.
Stuffing seasoning packets into instant noodles and driving rivets into airplanes are actually similar in terms of mathematical models.
Companies that produce instant noodles are obviously unlikely to have high-end equipment and technology.
“Probably... by weighing?”
Yao Mengna guessed:
“The seasoning packet accounts for about 10% of the weight of the entire package of instant noodles. If you put less or more of it, it should be easy to detect.”
“Hmm... but the weight of the noodle block itself varies, and there are several kinds of seasoning packets. Weighing can only show that the total is about right. It cannot guarantee that the right packets went in...”
Chang Haonan shook his head.
Zhu Yadan, sitting beside them, looked at Chang Haonan on her left and Yao Mengna on her right. She really did not know why these two had suddenly started discussing this.
“Um…”
Although she felt a bit awkward speaking up in front of two PhDs, in the end she could not help herself:
“Before the packaging step, wouldn't it be enough to find someone to watch next to the assembly line?”
Yao Mengna put a hand to her forehead:
“We are just thinking about how we can achieve the same effect without using this person.”
“Is that so...”
Zhu Yadan immediately shrank back:
“I was just saying... but sometimes the role of the human brain may not be replaceable...”
Calmness returned to the area around the dining table, except for the occasional faint sound of chewing.
But Chang Haonan still didn't move his chopsticks.
“You're right.”
A few minutes later, when Zhu Yadan was about to finish the fried noodles on the plate in front of her, Chang Haonan suddenly said:
“The human brain can parse high-dimensional data in a certain way to gain a perception of the external world.”
Zhu Yadan looked up, puzzled, but seeing that Chang Haonan was deep in thought, she knew better than to disturb him.
“In other words, high-dimensional external information must have an underlying nonlinear manifold structure in a low-dimensional space…”
Nearly 70 years ago, American statistician Harold Hotelling proposed the principal component analysis method to reduce the dimensionality of high-dimensional data.
He held that the larger the variance, the more information a component carries, and the smaller the variance, the less. He therefore built a few principal components with large variance and high information content as linear combinations of the original components, then used singular value decomposition of the matrix to reduce the dimension of the data.
But principal component analysis only amounts to finding the optimal linear mapping in the sense of minimizing the projection distance, and in reality few problems are that simply linear.
Still, the idea was worth borrowing.
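The principal component analysis idea described here can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `pca_reduce`, the toy data (points scattered near a straight line in three dimensions) and the noise level are all made up for the example, and the code is not taken from the story.

```python
import numpy as np


def pca_reduce(X, k):
    """Project the rows of X onto the k directions of largest variance."""
    mean = X.mean(axis=0)
    Xc = X - mean  # center the data so variance is measured about the mean
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    Y = Xc @ components.T  # low-dimensional coordinates
    explained = (S[:k] ** 2).sum() / (S ** 2).sum()  # share of total variance kept
    return Y, components, mean, explained


# Hypothetical toy data: 200 points close to a straight line in 3-D.
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=200)
direction = np.array([1.0, 2.0, 2.0]) / 3.0  # unit vector along the line
X = np.outer(t, direction) + 0.01 * rng.normal(size=(200, 3))

Y, components, mean, explained = pca_reduce(X, 1)
X_hat = Y @ components + mean  # linear reconstruction from one coordinate
```

Because the toy data really is almost linear, one component keeps nearly all of the variance. Data lying on a curved structure would not be captured this well by any single linear projection, which is the limitation the text points out.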
Chang Haonan put down the mutton soup noodles, of which he had eaten only one bite, stood up and quickly left the canteen.
Zhu Yadan, who was responsible for his security, quickly followed.
Yao Mengna was a little slower to react. Just as she was about to get up, she realized that they had not paid yet, so she had to take out her wallet and walk helplessly to the cashier.
Back in the office, Chang Haonan found the sheet of paper again
and wrote a few more lines below the three basic conditions.
Given a set of high-dimensional data $X = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^D$, where $n$ is the number of data samples and $D$ is the dimension of the high-dimensional data.
Assume that the data samples in
Find a mapping $f$ from the high-dimensional observation space to the low-dimensional embedding space such that $y_i = f(x_i)$, together with a one-to-one reconstruction mapping $f^{-1}$ such that $x_i = f^{-1}(y_i)$.
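A toy instance may make the pair of mappings concrete. The sketch below assumes a hand-picked curve, a spiral in the plane, where a single number is enough to locate each point. The names `embed` and `reconstruct` and the spiral itself are hypothetical choices for illustration. A real manifold learning method would have to estimate such mappings from the samples alone.

```python
import math


def reconstruct(y):
    """Reconstruction mapping: 1-D coordinate y -> point on the spiral in 2-D."""
    return (y * math.cos(y), y * math.sin(y))


def embed(x):
    """Embedding mapping: point on the spiral -> its 1-D coordinate.

    On this particular spiral the coordinate equals the distance from the origin.
    """
    return math.hypot(x[0], x[1])


# Low-dimensional coordinates y_i in [1, 10] and their high-dimensional images x_i.
ys = [1.0 + 0.1 * i for i in range(91)]
xs = [reconstruct(y) for y in ys]

# Round trip: embed each x_i, then reconstruct it again.
ys_back = [embed(x) for x in xs]
xs_back = [reconstruct(y) for y in ys_back]

max_err_y = max(abs(a - b) for a, b in zip(ys, ys_back))
max_err_x = max(math.dist(a, b) for a, b in zip(xs, xs_back))
```

Both round-trip errors stay at the level of floating-point rounding, so the two functions are inverses of each other on the spiral. No single straight-line projection of the plane could order these points along the curve in the same way.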
Having written this, Chang Haonan smiled with satisfaction.
Although he still had not produced a complete idea, he had at least turned the three abstract basic conditions into a concrete mathematical problem.
In theoretical research, stating the question clearly is almost half the way to success.
With that thought, he went back to the top of the paper and wrote three more words.
Manifold learning method.
(End of this chapter)