In 2015, Salesforce researchers working in a basement beneath a West Elm furniture store in Palo Alto developed the prototype of what would later become Einstein, Salesforce’s AI platform that powers predictions across its products. As of November, Einstein was delivering more than 80 billion predictions every day for thousands of businesses and millions of users. But although the technology remains at the core of Salesforce’s business, it is just one of many research areas under the purview of Salesforce Research, Salesforce’s AI research and development department.
Salesforce Research’s mission is to advance AI technology and pave the way for new products, applications, and research directions, an outgrowth of Salesforce CEO Marc Benioff’s commitment to AI as a revenue driver. In 201
Today, Salesforce Research’s work spans multiple fields, including computer vision, deep learning, speech, natural language processing, and reinforcement learning. The department’s projects are by no means purely commercial, ranging from drones that use AI to spot great white sharks to a system that can identify signs of breast cancer from tissue images. Work continued even as the pandemic forced Salesforce scientists out of the office for the foreseeable future. Just last year, Salesforce Research released an environment, the AI Economist, for understanding how AI can improve economic design, along with tools for testing the robustness of natural language models and a framework for spelling out the uses, risks, and biases of AI models.
According to Marco Casalaina, Einstein’s general manager, most of Salesforce Research’s work falls into one of two categories: pure research or applied research. Pure research includes things like the AI Economist, which are not immediately relevant to the tasks that Salesforce or its customers do today. Applied research, on the other hand, has clear business motivations and use cases.
Voice is a particularly active subfield of applied research at Salesforce Research. Last spring, as customer service representatives in Manila, the United States, and elsewhere were increasingly ordered to work from home, some companies began turning to AI to bridge the resulting service gaps. Casalaina said this spurred work on the call center side of Salesforce’s business.
“We are doing a lot of work to help our customers … with regard to real-time voice cues. We offer a whole coaching process for customer service representatives that takes place after the call,” Casalaina told VentureBeat in a recent interview. “The technology identifies moments that were good or bad but that were coachable in some way. We are also working on a number of capabilities, such as automatic escalation and wrap-up, as well as using the contents of the call to pre-fill fields for you and make your life a little easier.”
Richard Socher, the former chief scientist of Salesforce, told VentureBeat in a telephone interview that AI with healthcare applications is another pillar of Salesforce’s research. Socher joined Salesforce after its acquisition of MetaMind in 2016. He left Salesforce Research in July 2020 to found the search engine startup You.com, but remains a scientist emeritus at Salesforce.
“Medical computer vision in particular can have a great impact,” Socher said. “What is interesting is that the human visual system is not necessarily very good at reading X-rays, CT scans, and MRI scans in three dimensions or, more importantly, images of cells that might indicate cancer … The challenge is to predict diagnoses and treatment.”
To develop, train, and benchmark predictive healthcare models, Salesforce Research draws on a proprietary database containing tens of terabytes of data collected from US clinics, hospitals, and other points of care. Andre Esteva, head of medical AI at Salesforce Research, said that Salesforce is committed to adopting privacy-preserving techniques such as federated learning to protect patients’ anonymity.
“The next frontier is precision medicine and personalized therapies,” Esteva told VentureBeat. “It is not just what is present in an image or what is present in a patient, but what the patient’s future looks like, especially if we decide to put them on a therapy. We use AI to take all of the patient’s data: their medical imaging records, their lifestyle. Decisions are made, and the algorithm predicts whether they will live or die, whether they will be healthy or unhealthy, and so on.”
To this end, in December Salesforce Research open-sourced ReceptorNet, a machine learning system that researchers in the department developed in collaboration with clinicians at the University of Southern California’s Lawrence J. Ellison Institute for Transformative Medicine. The system can determine a biomarker that is critical for oncologists deciding on the appropriate treatment for breast cancer patients. It achieved 92% accuracy in a study published in the journal Nature Communications.
Usually, breast cancer cells extracted during a biopsy or surgery are tested to see whether they contain proteins that act as estrogen or progesterone receptors. When the hormones estrogen and progesterone attach to these receptors, they fuel the growth of the cancer. However, the images needed for this kind of test are less widely available and require a pathologist to review them.
In contrast, ReceptorNet determines hormone receptor status from hematoxylin and eosin (H&E) staining, which takes into account the shape, size, and structure of cells. Salesforce researchers trained the system on H&E image slides from thousands of cancer patients at “dozens” of hospitals around the world.
Studies have shown that much of the data used to train disease diagnosis algorithms may perpetuate inequalities. Recently, a team of British scientists found that almost all eye disease datasets come from patients in North America, Europe, and China, which means that eye disease diagnosis algorithms are less certain to work well for ethnic groups from underrepresented countries. In another study, Stanford University researchers found that most of the US data used in studies involving medical uses of AI comes from California, New York, and Massachusetts.
But Salesforce claims that when it analyzed ReceptorNet for signs of age-, race-, and geography-related bias, it found no statistically significant difference in performance. The company also says that the algorithm delivers accurate predictions regardless of differences in how tissue samples are prepared.
“In breast cancer classification, we were able to classify certain images without the need for an expensive and time-consuming staining process,” Socher said. “Long story short, this is one of the areas where AI can solve a problem in a way that helps in end applications.”
In a related project detailed in a paper published in March last year, scientists at Salesforce Research developed an AI system called ProGen that can generate proteins in a “controllable way.” Given the desired properties of a protein, such as its molecular function or cellular component, ProGen creates the protein by treating the amino acids that make it up like words in a paragraph.
The Salesforce research team behind ProGen trained the model on a dataset of more than 280 million protein sequences and associated metadata, the largest such dataset publicly available. The model took each training sample and framed a guessing game for each amino acid. Over more than a million rounds of training, ProGen tried to predict the next amino acid from the amino acids that came before it. Over time, the model learned to generate proteins with sequences it had not seen before.
In the future, Salesforce researchers intend to improve ProGen’s ability to synthesize new proteins (whether undiscovered or non-existent) by homing in on specific protein properties.
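The “guessing game” described above can be pictured with a small Python sketch. This is only an illustration of next-token training pairs, not ProGen’s actual code: the function name `next_token_pairs` and the tag string `<function:kinase>` are hypothetical, and real training would feed these pairs to a large neural language model.

```python
from typing import List, Tuple


def next_token_pairs(tags: List[str], sequence: str) -> List[Tuple[List[str], str]]:
    """Turn one tagged protein sequence into (context, target) pairs.

    `tags` stands in for the desired properties (hypothetical format);
    `sequence` is a string of one-letter amino acid codes.
    """
    tokens = list(tags) + list(sequence)
    pairs = []
    # Only amino acids are prediction targets; the tags act purely as
    # conditioning context placed in front of the sequence.
    for i in range(len(tags), len(tokens)):
        pairs.append((tokens[:i], tokens[i]))
    return pairs


pairs = next_token_pairs(["<function:kinase>"], "MKV")
for context, target in pairs:
    print(context, "->", target)
```

Each pair asks the model to guess one amino acid given the property tags and everything that precedes it, which is the same setup used to predict the next word in a sentence.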
Salesforce Research’s ethical AI work straddles applied research and pure research. Casalaina said that customers are becoming more and more interested in it, and that over the past six months he has had many conversations with customers about the ethics of AI.
In January, Salesforce researchers released Robustness Gym, which aims to unify a patchwork of libraries in order to strengthen testing strategies for natural language models. Robustness Gym provides guidance on how certain variables can help prioritize which evaluations to run. Specifically, it describes the influence of the task, through its structure and known prior evaluations; of needs such as testing generalization, fairness, or security; and of constraints such as expertise, access to compute, and human resources.
In natural language research, robustness testing is often the exception rather than the rule. One report found that 60% to 70% of the answers given by natural language processing models were embedded somewhere in the benchmark training sets, which suggests that the models were often simply memorizing answers. Another study found that the metrics used to benchmark AI and machine learning models are often inconsistent, irregularly tracked, and not particularly informative.
In one case study, Salesforce Research had a sentiment modeling team at a “large technology company” use Robustness Gym to measure the bias of its model. After testing the system, the modeling team found a performance drop of up to 18%.
In a more recent study published in July, Salesforce researchers proposed a new method for reducing gender bias in word embeddings, which are used to train AI models to summarize, translate languages, and perform other prediction tasks. Word embeddings capture the semantic and syntactic meaning of words and their relationships to other words, which is why they are so commonly used in natural language processing. But they tend to inherit gender bias.
Salesforce’s solution, Double-Hard Debias, transforms the embedding space into an ostensibly genderless one. It maps word embeddings into a “subspace” that can be used to find the dimension encoding frequency information, which can interfere with the encoded gender. It then “projects away” the component along this dimension to obtain revised embeddings before performing another debiasing operation.
To evaluate Double-Hard Debias, the researchers tested it on the WinoBias dataset, which consists of sentences that conform to gender stereotypes and sentences that run against them. While preserving semantic information, Double-Hard Debias reduced the bias score of embeddings obtained using the GloVe algorithm from 15 (across the two types of sentences) to 7.7.
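The “projecting away” step can be illustrated with a minimal Python sketch. It is not the researchers’ implementation: in the method described above the directions are discovered from the embeddings themselves, whereas here `freq_dir` and `gender_dir` are hypothetical inputs handed in directly, and the vectors are toy three-dimensional examples.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def project_away(vec: Vector, direction: Vector) -> Vector:
    """Remove from `vec` its component along `direction`."""
    norm_sq = dot(direction, direction)
    if norm_sq == 0:
        raise ValueError("direction must be non-zero")
    scale = dot(vec, direction) / norm_sq
    return [v - scale * d for v, d in zip(vec, direction)]


def two_step_debias(vec: Vector, freq_dir: Vector, gender_dir: Vector) -> Vector:
    """First drop the frequency component, then the gender component."""
    without_freq = project_away(vec, freq_dir)
    return project_away(without_freq, gender_dir)


# Toy example with assumed, mutually orthogonal directions.
embedding = [3.0, 2.0, 1.0]
freq_dir = [1.0, 0.0, 0.0]
gender_dir = [0.0, 1.0, 0.0]
debiased = two_step_debias(embedding, freq_dir, gender_dir)
print(debiased)
```

After the second projection the result has no component along the assumed gender direction, while whatever lies outside both directions is left untouched.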
Looking ahead, with the pandemic making the benefits of automation clear, Casalaina expects automation to remain a core focus area for Salesforce Research. He predicts that chatbots built to answer customer questions will become more capable than they are today, as will robotic process automation technologies that handle repetitive back-office tasks.
There are figures to support Casalaina’s claim. In November, Salesforce reported that Einstein Bot sessions had increased by 300% since February of this year, a 680% year-over-year increase compared to 2019. In addition, predictions for agent assistance and service automation increased by 700%, and daily predictions for Einstein for Commerce increased by 300% in the third quarter of 2020. As for Einstein for Marketing Cloud and Einstein for Sales, email and mobile personalization predictions rose by 67% in the third quarter, and there was a 32% increase in converting prospects into buyers using Einstein Lead Scoring.
“The goal here, and at Salesforce Research as a whole, is to remove the groundwork for people. A lot of the focus is on the model, how good the model is, and all of those things,” Casalaina said. “But that is only 20% of the equation. The other 80% is how humans use it.”