IBM’s CodeNet dataset can teach AI to translate computer languages


AI and machine learning systems have grown increasingly capable in recent years, not only at understanding the written word but at writing it as well. But while these systems have come close to mastering English, they have yet to become fluent in the languages of computers – that is, until now. IBM announced at its Think 2021 conference on Monday that its researchers have developed a Rosetta Stone for programming code.

Over the past decade, advances in AI have mainly been “driven by deep neural networks, and even that was driven by three major factors: data, with the availability of large training sets; innovations in new algorithms; and the massive acceleration of ever faster compute hardware driven by GPUs,” said Ruchir Puri, IBM Fellow and Chief Scientist at IBM Research, during his Think 2021 presentation, comparing the new dataset to the venerated ImageNet, which spurred the recent boom in computer vision.

“Software is eating the world,” Marc Andreessen wrote in 2011. “And if software is eating the world, AI is eating software,” Puri told Engadget. “It’s this relationship between the visual tasks and the language tasks, where common algorithms could be used across them, that has led to the revolution in natural language processing, starting with the advent of Watson on Jeopardy, back in 2012,” he continued.

In effect, we have taught computers to speak human, so why not also teach computers to speak more computer? That is what IBM’s Project CodeNet aims to achieve. “We need our ImageNet, which can snowball the innovation and unleash this innovation in algorithms,” said Puri. CodeNet is essentially the ImageNet of code. It is an expansive dataset designed to teach AI/ML systems how to translate code, and it consists of some 14 million code snippets and 500 million lines spread across more than 55 legacy and active languages, from COBOL and FORTRAN to Java, C++, and Python.

“Since the dataset itself contains 50 different languages, it can enable algorithms for many pairwise combinations,” Puri said. “Having said that, there has been work done in human languages, such as neural machine translation, which, rather than working pairwise, becomes more language-independent and can derive an intermediate abstraction through which it translates into many different languages.” In short, the dataset is constructed in a way that supports bidirectional translation. So you can take a piece of legacy COBOL code (which, alarmingly, still underpins much of this country’s government and corporate systems) and translate it into Java as easily as you can take a snippet of Java and turn it back into COBOL.
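To put the scale of translating among 50 languages in rough numbers, here is a minimal sketch. It assumes one dedicated translator per ordered language pair in the pairwise case, and one encoder plus one decoder per language when everything passes through a shared intermediate form. The function names are illustrative and do not come from IBM or CodeNet.

```python
def pairwise_translators(n_languages: int) -> int:
    """Translators needed if every ordered (source, target) pair gets its own model."""
    return n_languages * (n_languages - 1)


def pivot_components(n_languages: int) -> int:
    """Components needed with a shared intermediate form:
    one encoder into it and one decoder out of it per language."""
    return 2 * n_languages


if __name__ == "__main__":
    n = 50  # the language count Puri cites
    print(pairwise_translators(n))  # 2450
    print(pivot_components(n))      # 100
```

The gap widens quickly as languages are added, which is one reason a language-independent intermediate representation is attractive.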

“We believe that natural language processing and machine learning can be applied to understanding programming languages by doing automated reasoning and decision-making, and by being able to explain those decisions, just as we do in computer vision and natural language processing,” he said.

But, as with human languages, computer code is written to be understood in a specific context. However, unlike human languages, “programming languages can be compared, very succinctly, on a metric of ‘does the program compile, does the program do what it was supposed to do and, if there is a test set, does it solve and meet the criteria of the test,’” Puri said. CodeNet can therefore be used for tasks such as code search and clone detection, in addition to its translation duties and its role as a benchmark dataset. In addition, each sample is labeled with its CPU run time and memory footprint, allowing researchers to run regression studies and potentially develop automated code-correction systems.

Project CodeNet consists of more than 14 million code samples along with more than 4,000 coding problems, collected and curated from decades of programming challenges and competitions around the world. “The way the dataset came about,” Puri said, “there are many kinds of programming competitions and all kinds of problems: some are more businesslike, some are more academic. These are the languages that have been used over the last decade and a half in these competitions, with thousands of students or competitors submitting solutions.”

In addition, users can run individual code samples “to extract metadata and verify outputs from generative AI models for correctness,” according to IBM. The company says this will let researchers check that a program’s intent is preserved when it is translated from one programming language into another.
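As a rough illustration of what checking outputs for correctness can look like, the sketch below runs two code samples on the same inputs and compares what they print. It is not IBM's tooling: the helper names and the two tiny samples are hypothetical, it handles only Python samples, and agreeing on a handful of inputs is evidence of equivalence rather than proof.

```python
import subprocess
import sys


def run_sample(source: str, stdin_text: str, timeout: float = 5.0) -> str:
    """Run a Python code sample in a subprocess and return what it printed.

    Raises CalledProcessError if the sample exits with an error.
    """
    result = subprocess.run(
        [sys.executable, "-c", source],
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout


def same_output(sample_a: str, sample_b: str, test_inputs: list[str]) -> bool:
    """True if both samples print the same whitespace-separated tokens on every input."""
    return all(
        run_sample(sample_a, text).split() == run_sample(sample_b, text).split()
        for text in test_inputs
    )


# Two hypothetical submissions to the same problem: sum the integers on one line.
LOOP_VERSION = "total = 0\nfor tok in input().split():\n    total += int(tok)\nprint(total)\n"
BUILTIN_VERSION = "print(sum(map(int, input().split())))\n"

if __name__ == "__main__":
    print(same_output(LOOP_VERSION, BUILTIN_VERSION, ["1 2 3\n", "10 -4\n"]))  # True
```

For a cross-language check, `run_sample` would need to invoke the appropriate compiler or interpreter for each sample, but the comparison step stays the same.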

Although this dataset could in principle be used to generate entirely new code, much as GPT-3 does with English, CodeNet’s strength lies in its ability to translate. “We are trying to do exactly what ImageNet did for computer vision,” he said. “It fundamentally changed the game; it is highly curated, with a very targeted dataset for a very broad domain. We hope that CodeNet, with its diversity of tasks, its diversity of data, and its large scale, will bring the same value.” In addition, Puri estimates that more than 80 percent of the problems each have more than 100 submitted solutions, offering a wide range of possible answers.

“We are very excited about this,” Puri said. “We hope and believe it will be to code what ImageNet was to computer vision.” IBM intends to release the CodeNet data publicly, giving researchers around the world equal and free access.

All products recommended by Engadget are selected by our editorial team, independent of our parent company. Some of our stories include affiliate links. If you buy something through one of these links, we may earn an affiliate commission.

