Doing your first Exa search
Getting started with raw HTTP requests is simple: all you need to do is grab an API key and make a request to one of our API endpoints.
1. Create an account and grab an API key
First, create an Exa account and generate an API key.
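Once you have a key, a common pattern is to export it as an environment variable so the curl request below can read it via `${EXA_API_KEY}` (a sketch; replace the placeholder with your real key):

```shell
# Store your Exa API key in an environment variable so it does not
# appear in plain text inside your scripts.
# Replace the placeholder value with the key from your Exa account.
export EXA_API_KEY="your-api-key-here"

# Quick sanity check that the variable is set (prints only its length,
# never the key itself).
echo "Key length: ${#EXA_API_KEY}"
```

Add the export line to your shell profile (e.g. `~/.bashrc`) if you want the key available in every session.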
2. Call search
curl --request POST \
  --url https://api.exa.ai/search \
  --header 'accept: application/json' \
  --header 'content-type: application/json' \
  --header "x-api-key: ${EXA_API_KEY}" \
  --data '
{
  "query": "Here is an article about the state of search:",
  "type": "neural",
  "useAutoprompt": true,
  "numResults": 10,
  "contents": {
    "text": true
  }
}'
This returns the following results:
{
"autopromptString": "Here is an insightful article about the state of search:",
"results": [
{
"score": 0.2181776762008667,
"title": "The Open Secret of Google Search",
"id": "https://www.theatlantic.com/ideas/archive/2022/06/google-search-algorithm-internet/661325/",
"url": "https://www.theatlantic.com/ideas/archive/2022/06/google-search-algorithm-internet/661325/",
"publishedDate": "2022-06-20",
"author": "Charlie Warzel",
"text": "One of the most-used tools on the internet is not what it used to be. Getty; The Atlantic This article was featured in One Story to Read Today, a newsletter in which our editors recommend a single must-read from The Atlantic, Monday through Friday. Sign up for it here. A few weeks ago my house had a septic-tank emergency, which is as awful as it sounds. As unspeakable things began to burble up from my shower drain, I did what any smartphone-dependent person would: I frantically Googled something along the lines of poop coming from shower drain bad what to do. I was met with a slew of cookie-cutter websites, most of which appeared hastily generated and were choked with enough repetitive buzzwords as to be barely readable. Virtually everything I found was unhelpful, so we did the old-fashioned thing and called a professional. The emergency came and went, but I kept thinking about those middling search results—how they typified a zombified internet wasteland. Like many, I use Google to answer most of the mundane questions that pop up in my day-to-day life. And yet that first page of search results feels like it’s been surfacing fewer satisfying answers lately. I’m not alone; the frustration has become a persistent meme: that Google Search, what many consider an indispensable tool of modern life, is dead or dying. For the past few years, across various forums and social-media platforms, people have been claiming in viral posts that Google’s flagship product is broken. Search google dying on Twitter or Reddit and you can see people grousing about it going back to the mid 2010s. Lately, though, the criticisms have grown louder. 
In February, an engineer named Dmitri Brereton wrote a blog post about Google’s search-engine decay, rounding up leading theories for why the product’s “results have gone to shit.” The post quickly shot to the top of tech forums such as Hacker News and was widely shared on Twitter and even prompted a PR response from Google’s Search liaison, Danny Sullivan, refuting one of Brereton’s claims. “You said in the post that quotes don’t give exact matches. They really do. Honest,” Sullivan wrote in a series of tweets. Read: Be careful what you Google Brereton’s most intriguing argument for the demise of Google Search was that savvy users of the platform no longer type instinctive keywords into the search bar and hit “Enter.” The best Googlers—the ones looking for actionable or niche information, product reviews, and interesting discussions—know a cheat code to bypass the sea of corporate search results clogging the top third of the screen. “Most of the web has become too inauthentic to trust,” Brereton argued, therefore “we resort to using Google, and appending the word ‘reddit’ to the end of our queries.” Brereton cited Google Trends data that show that people are searching the word reddit on Google more than ever before. Instead of scrolling through long posts littered with pop-up ads and paragraphs of barely coherent SEO chum to get to a review or a recipe, clever searchers got lively threads with testimonials from real people debating and interacting with one another. Most who use the Reddit hack are doing so for practical reasons, but it’s also a small act of protest—a way to stick it to the Search Engine Optimization and Online Ad Industrial Complex and to attempt to access a part of the internet that feels freer and more human. Google has built wildly successful mobile operating systems, mapped the world, changed how we email and store photos, and tried, with varying success, to build cars that drive themselves. 
This story, for example, was researched, in part, through countless Google Search queries and some Google Chrome browsing, written in a Google Doc, and filed to my editor via Gmail. Along the way, the company has collected an unfathomable amount of data on billions of people (frequently unbeknownst to them)—but Google’s parent company, Alphabet, is still primarily an advertising business. In 2020, the company made $147 billion in revenue off ads alone, which is roughly 80 percent of its total revenue. Most of the tech company’s products—Maps, Gmail—are Trojan horses for a gargantuan personalized-advertising business, and Search is the one that started it all. It is the modern template for what the technology critic Shoshana Zuboff termed “surveillance capitalism.” The internet has grown exponentially and Google has expanded with it, helping usher in some of the web’s greediest, most extractive tendencies. But scale is not always a blessing for technology products. Are we wringing our hands over nothing, or is Google a victim of its own success, rendering its flagship product—Search—less useful? One can’t really overstate the way that Google Search, when it rolled out in 1997, changed how people used the internet. Before Google came out with its goal to crawl the entire web and organize the world’s information, search engines were moderately useful at best. And yet, in the early days, there was much more search competition than there is now; Yahoo, Altavista, and Lycos were popular online destinations. But Google’s “PageRank” ranking algorithm helped crack the problem. The algorithm counted and indexed the number and quality of links that pointed to a given website. Rather than use a simple keyword match, PageRank figured that the best results would be websites that were linked to by many other high-quality websites. 
The algorithm worked, and the Google of the late 1990s seemed almost magical: You typed in what you were looking for, and what you got back felt not just relevant but intuitive. The machine understood. Most people don’t need a history lesson to know that Google has changed; they feel it. Try searching for a product on your smartphone and you’ll see that what was once a small teal bar featuring one “sponsored link” is now a hard-to-decipher, multi-scroll slog, filled with paid-product carousels; multiple paid-link ads; the dreaded, algorithmically generated “People also ask” box; another paid carousel; a sponsored “buying guide”; and a Maps widget showing stores selling products near your location. Once you’ve scrolled through that, multiple screen lengths below, you’ll find the unpaid search results. Like much of the internet in 2022, it feels monetized to death, soulless, and exhausting. I cover Google for a living so I am obviously aware how the results page has evolved over the years. Today, I was searching for “hearing aids” for my dad on my phone and I was stunned by the number of ads, and non-link results. It’s pretty stunning pic.twitter.com/jZZzDWRzdO — Daisuke Wakabayashi (@daiwaka) March 13, 2022 There are all kinds of theories for those ever-intrusive ads. One is that the cost-per-click rates that Google charges advertisers are down, because of competition from Facebook and Amazon (Google is rolling out larger commerce-search ad widgets in response this year) as well as a slowdown in paid-search-result spending. Another issue may stem from cookie-tracking changes that Google is implementing in response to privacy laws such as Europe’s General Data Protection Regulation and the California Consumer Privacy Act. For the past two years, Google has been planning to remove third-party cookies from its Chrome browser. 
And though Google Search won’t be affected by the cookie ban, the glut of search ads might be an attempt to recoup some of the money that Google stands to lose in the changes to Chrome. If so, this is an example of fixing one problem while creating another. But when I suggested this to Google, the company was unequivocal, arguing that “there is no connection” between Chrome’s plans to phase out support for third-party cookies and Search ads. The company also said that the number of ads it shows in search results “has been capped for several years, and we have not made any changes.” Google claims that, “on average over the past four years, 80 percent of searches on Google haven’t had any ads at the top of search results.” Any hunt for answers about Google’s Search algorithms will lead you into the world of SEO experts like Marie Haynes. Haynes is a consultant who has been studying Google’s algorithms obsessively since 2008. Part of her job is to keep up with every small change made by the company’s engineers and public communication by Google’s Search-team blog. Companies that can divine the whims of Google’s constantly updated algorithms are rewarded with coveted page real estate. Ranking high means more attention, which theoretically means more money. When Google announced in October 2020 that it would begin rolling out “passage indexing”—a new way for the company to pull out and rank discrete passages from websites—Haynes tried to figure out how it would change what people ultimately see when they query. Rather than reverse engineer posts to sound like bot-written babble, she and her team attempt to balance maintaining a page’s integrity while also appealing to the algorithm. 
And though Google provides SEO insiders with frequent updates, the company’s Search algorithms are a black box (a trade secret that it doesn’t want to give to competitors or to spammers who will use it to manipulate the product), which means that knowing what kind of information Google will privilege takes a lot of educated guesswork and trial and error. Charlie Warzel: The internet Google left behind Haynes agrees that ads’ presence on Search is worse than ever and the company’s decision to prioritize its own products and features over organic results is frustrating. But she argues that Google’s flagship product has actually gotten better and much more complex over time. That complexity, she suggests, might be why searching feels different right now. “We’re in this transition phase,” she told me, noting that the company has made significant advancements in artificial intelligence and machine learning to decipher user queries. Those technical changes have caused it to move away from the PageRank paradigm. But those efforts, she suggested, are in their infancy and perhaps still working out their kinks. In May 2021, Google announced MUM (short for Multitask Unified Model), a natural-language-processing technology for Search that is 1,000 times more powerful than its predecessor. “The AI attempts to understand not just what the searcher is typing, but what the searcher is trying to get at,” Haynes told me. “It’s trying to understand the content inside pages and inside queries, and that will change the type of result people get.” Google’s focus on searcher intent could mean that when people type in keywords, they’re not getting as many direct word matches. Instead, Google is trying to scan the query, make meaning from it, and surface pages that it thinks match that meaning. Despite being a bit sci-fi and creepy, the shift might feel like a loss of agency for searchers. 
Search used to feel like a tool that you controlled, but Google may start to behave more like, well, a person—a concierge that has its own ideas and processes. The problematic effects of increased AI inference over time are easy to imagine (while I was writing this article, a Google researcher went viral claiming he’d been placed on administrative leave after notifying the company that one of its AI chatbots—powered by different technology—had become sentient, though the company disagrees). Google could use such technology to continue to lead people away from their intended searches and toward its own products and paid ads with greater frequency. Or, less deviously, it could simply gently algorithmically nudge people in unexpected directions. Imagine all the life decisions that you make in a given year based on information you process after Googling. This means that the stakes of Google’s AI interpreting a searcher’s intent are high. Read: Google’s ‘sentient’ chatbot is our self-deceiving future But some of Google’s lifeless results are made by humans. Zach Verbit knows what it’s like to serve at the pleasure of Google’s Search algorithms. After college, Verbit took a freelance-writing gig with the HOTH, a marketing company that specializes in search-engine optimization. Verbit’s “soul crushing” job at the HOTH was to write blog posts that would help clients’ sites rank highly. He spent hours composing listicles with titles like “10 Things to Do When Your Air-Conditioning Stopped Working.” Verbit wrote posts that “sounded robotic or like they were written by somebody who’d just discovered language.” He had to write up to 10 posts a day on subjects he knew nothing about. Quickly, he started repurposing old posts for other clients’ blogs. “Those posts that sound like an AI wrote them? Sometimes they’re from real people trying to jam in as many keywords as possible,” Verbit told me. That his hastily researched posts appeared high in search results left him dispirited. 
He quit the job after a year, describing the industry of search-gaming as a house of cards. His time in the SEO mines signaled to him the decline of Google Search, arguably the simplest, most effective, and most revolutionary product of the modern internet. “The more I did the job, the more I realized that Google Search is completely useless now,” he said. HOTH’s CEO, Marc Hardgrove disputed the notion that its client blog posts were “over-optimized” for SEO purposes and that the company discourages jargony posts as they don’t rank as high. “Overusing keywords and creating un-compelling content would be detrimental to our success as an SEO company, he wrote in an email. “That’s why The HOTH does not require, or even encourage, the writers we work with to overuse keywords into their blog posts to help with optimization.” Google is still useful for many, but the harder question is why its results feel more sterile than they did five years ago. Haynes’s theory is that this is the result of Google trying to crack down on misinformation and low-quality content—especially around consequential search topics. In 2017, the company started talking publicly about a Search initiative called EAT, which stands for “expertise, authoritativeness, and trustworthiness.” The company has rolled out numerous quality rater guidelines, which help judge content to determine authenticity. One such effort, titled Your Money or Your Life, applies rigorous standards to any pages that show up when users search for medical or financial information. “Take crypto,” Haynes explained. “It’s an area with a lot of fraud, so unless a site has a big presence around the web and Google gets the sense they’re known for expertise on that topic, it’ll be difficult to get them to rank.” What this means, though, is that Google’s results on any topic deemed sensitive enough will likely be from established sources. 
Medical queries are far more likely to return WebMD or Mayo Clinic pages, instead of personal testimonials. This, Haynes said, is especially challenging for people looking for homeopathic or alternative-medicine remedies. There’s a strange irony to all of this. For years, researchers, technologists, politicians, and journalists have agonized and cautioned against the wildness of the internet and its penchant for amplifying conspiracy theories, divisive subject matter, and flat-out false information. Many people, myself included, have argued for platforms to surface quality, authoritative information above all else, even at the expense of profit. And it’s possible that Google has, in some sense, listened (albeit after far too much inaction) and, maybe, partly succeeded in showing higher-quality results in a number of contentious categories. But instead of ushering in an era of perfect information, the changes might be behind the complainers’ sense that Google Search has stopped delivering interesting results. In theory, we crave authoritative information, but authoritative information can be dry and boring. It reads more like a government form or a textbook than a novel. The internet that many people know and love is the opposite—it is messy, chaotic, unpredictable. It is exhausting, unending, and always a little bit dangerous. It is profoundly human. But it’s worth remembering what that humanity looked like inside search results. Rand Fishkin, the founder of the software company SparkToro, who has been writing and thinking about search since 2004, believes that Google has gotten better at not amplifying conspiracy theories and hate speech, but that it took the company far too long. “I don’t know if you searched for holocaust information between 2000 and 2008, but deniers routinely showed up in the top results,” he told me. 
The same was true for Sandy Hook hoaxers—in fact, campaigns from the Sandy Hook families to fight the conspiracy theories led to some of the search engine’s changes. “Whenever somebody says, ‘Hey, Google doesn’t feel as human anymore,’ all I can say is that I bet they don’t want a return to that,” Fishkin said. Google Search might be worse now because, like much of the internet, it has matured and has been ruthlessly commercialized. In an attempt to avoid regulation and be corporate-friendly, parts of it might be less wild. But some of what feels dead or dying about Google might be our own nostalgia for a smaller, less mature internet. Sullivan, the Search liaison, understands this longing for the past, but told me that what feels like a Google change is also the search engine responding to the evolution of the web. “Some of that blog-style content has migrated over time to closed forums or social media. Sometimes the blog post we’re hoping to find isn’t there.” Sullivan believes that some of the recent frustrations with Google Search actually reflect just how good it’s become. “We search for things today we didn’t imagine we could search for 15 years ago and we believe we’ll find exactly what we want,” he said. “Our expectations have continued to grow. So we demand more of the tool.” It’s an interesting, albeit convenient, response. From the July/August 2008 issue: Is Google making us stupid? Google has rewired us, transforming the way that we evaluate, process, access, and even conceive of information. “I can’t live without that stuff as my brain is now conditioned to remember only snippets for Google to fill in,” one Reddit user wrote while discussing Brereton’s “Google Is Dying” post. Similarly, Google users shape Search. “The younger generation searches really differently than I do,” Haynes told me. 
“They basically speak to Google like it’s a person, whereas I do keyword searching, which is old-school.” But these quirks, tics, and varying behaviors are just data for the search giant. When younger generations intuitively start talking to Google like it’s a person, the tool starts to anticipate that and begins to behave like one (this is part of the reason behind the rise of humanized AI voice assistants). Fishkin argues that Google Search—and many of Google’s other products—would be better with some competition and that Search’s quality improved the most from 1998 to 2007, which he attributes to the company’s need to compete for market share. “Since then,” he said, “Google’s biggest search innovation has been to put more Google products up front in results.” He argues that this strategy has actually led to a slew of underwhelming Google products. “Are Google Flights or Google Weather or Google’s stocks widget better than competitors? No, but nobody can really compete, thanks to the Search monopoly.” “Is Google Search dying?” is a frivolous question. We care about Search’s fate on a practical level—it is still a primary way to tap into the internet’s promise of unlimited information on demand. But I think we also care on an existential level—because Google’s first product is a placeholder to explore our hopes and fears about technology’s place in our life. We yearn for more convenience, more innovation, more possibility. But when we get it, often we can only see what we’ve lost in the process. That loss is real and deeply felt. It’s like losing a piece of our humanity. Search, because of its utility, is even more fraught. Most people don’t want their information mediated by bloated, monopolistic, surveilling tech companies, but they also don’t want to go all the way back to a time before them. What we really want is something in between. 
The evolution of Google Search is unsettling because it seems to suggest that, on the internet we’ve built, there’s very little room for equilibrium or compromise."
},
{
"score": 0.1931779384613037,
"title": "Search is Eating The World",
"id": "https://medium.com/@softwaredoug/search-is-eating-the-world-1c3dbdfe9b83",
"url": "https://medium.com/@softwaredoug/search-is-eating-the-world-1c3dbdfe9b83",
"publishedDate": "2015-04-15",
"author": "Doug Turnbull",
"text": "Few recognize that the search bar, not software, is today’s core value proposition. The following is a modified excerpt form “ Taming Search ” from Manning Publications. Use discount code 38turnbull to get 38% off the book. Venture Capitalist Marc Andreessen famously wrote that “Software is Eating The World”. In his Wall Street Journal article, Andreessen rifles through a number of industries that software has revolutionized. Borders famously gave up online retail to Amazon, writing it off as unimportant. Netflix eviscerated Blockbuster’s brick-and-mortar rental monopoly. The recording industry has been challenged by the likes of iTunes and Spotify. Andreessen highlights dozens of traditional and brick-and-mortar businesses that failed to cope with the dominance of software. What do Andreesen’s examples have in common? Much of the fundamental value shift has actually been towards search, not software. Search is eating the world. Amazon beat Borders because a well-tuned search engine can take me directly to a title I’m searching for faster than a teenager lost in limited stacks of books. Even if it once took days for a DVD to arrive, I’d rather use Netflix’s search-driven movie rental experience to narrow-in on a movie I’d like to watch instead of being limited by a store’s shelf space and overbearing staff. In many of Andreesen’s examples there’s a search engine doing software’s heavy lifting — fundamentally responsible for more efficiently delivering value to users and businesses. Search applications like Web, E-commerce, and research applications, attempt to replace traditional methods that have typically acted as a gateway to products and information. Search engines replace librarians and sales representatives. Those that have done this succesfully find business gold, those that don’t are stuck in mediocrity. 
Yahoo famously focused on their human curated Web directory of vetted content acting more like a librarian for the Web than what we now take for granted in Google’s amazing search experience. Now Google experiences much greater success while Yahoo struggles to catch up. This is just the tip of the iceberg. The applications of search are only limited by our imagination. In law, armies of paralegals once combed through extensive legal tomes for answers. Now these tomes gather dust like old bowling trophies. They've been replaced by smarter, precise legal search used by a small, elite cadre of lawyers. When we shop for real estate, we don’t peruse pages of housing publications anymore. We use a search interface that ranks properties based on several pieces of criteria like square footage, school zone, and number of bedrooms. The list goes on. Despite the fact that search-based user experiences are revolutionizing our lives, so many fail to grasp how their terrible search drives customers away. Sadly, many organizations are slow to change and unable to see the competitor chomping at their heels with a better and smarter search user experience. This blindness isn’t limited to the business — developers obsess over their application’s navigation, and neglect the search bar as an afterthought for finding content. Its scary how easy it is for the organization to ignore search. So few understand what it takes to build a search user experience that understands users, their vernacular, and your content. It’s easy for all parts of the organization to focus on what they know — old forms of navigation and discovery. The art of relevance engineering — building search fluent in the subtleties of your content, users, and user experience— is unlike any other form of software developement. Its outside our comfort zone, but it can be taught. 
So, regardless of much search fails you today; despite the unique challenge to building user intelligence or domain fluency into your search application, you must keep trying. You must pull out all the stops to avoid being the next Blockbuster or Borders. Don’t be surprised, be ahead of the curve. Build a practice of search expertise. Put search front and center. Don’t bury it. Learn how to treat it as a profit center, not a cost anchor. Because, as history has shown, those with relevant search will prosper. The slow and irrelevant die. If you enjoyed this, read more about “Taming Search” and the art of building fluent, relevant search applications here ."
},
{
"score": 0.18812710046768188,
"title": "What it Takes to Build a Really Great Search Engine | HackerNoon",
"id": "https://hackernoon.com/what-it-takes-to-build-a-really-great-search-engine",
"url": "https://hackernoon.com/what-it-takes-to-build-a-really-great-search-engine",
"publishedDate": "2022-06-12",
"author": "Adrian H Raudaschl",
"text": "Building a successful search takes more than technical smarts, but if done right is one of the most rewarding products you could work on.\nEverything is a search problem. From finding the keys in our pocket (thanks, AirTag) to looking up a long-lost lover on LinkedIn — seeking things and answers is a problem that predates civilization.\nWhen historians look back at our lives, they will proclaim how this was the time when humans industrialized search on a scale that touched every facet of our lives. Our era marks the transition from search being used for survival to one of expanding our knowledge. It’s hard not to see the impact.\nAn average person conducts three and four searches each day; more times than I can remember to drink water. Google.com is the most dominant website on the planet, logging approximately 5.6 billion searches daily. We have been looking for answers to life’s questions for centuries, but it’s never been so easy, expansive, and quick to do so. To cope with the newfound convenience, we have honed skills for sifting the informational zeitgeist and learning to pose questions in ways a machine can understand. \n“You have people walking around with all the knowledge of humanity on their phone, but they have no idea how to integrate it. We don’t train people in thinking or reasoning.” David Epstein, Range\n\nServers, algorithms, and programming languages aid us in this struggle, a compass for mapping the optimal route from query to knowledge. For those of us building such search tools, our problem is knowing that when our users search for anything, there are few guarantees we will provide relevant answers. So, how best to approach this? The secret to a great search\nBeing an excellent search product team is the same as being a great team for anything else. 
The principles are almost cliche: Understand and help your users solve a valuable problem and work backward from there.\nWhen my team and I built our first search engine on Mendeley (Reference management software for researchers), the key to our unexpected success was understanding that search was not just about helping academics find research papers. What was important was understanding that finding relevant results for our users was key to a better onboarding experience, leading to improved user retention and product growth. Our first search product: Mendeley Search. A search to help academics find papers. Thanks to focusing on longer-term product impact, we didn’t get distracted by arbitrary goals like optimizing result relevancy. Instead, we concentrated more on the holistic user experience and prioritized features that built trust, such as explaining result relevancy or assisting users with choosing better keywords. Had we doubled down on search vanity metrics like response speed or % of clicks in the first 10 results, I doubt we would have achieved our double-digit growth and engagement. I’ve noticed that many who enter the field of search have technical backgrounds and t’s not hard to understand why. Search products are inherently more technical, speculative and experimental than many other user-facing features.\nHaving the skills to navigate the technical complexities of search, so you understand its limitations and capabilities is a significant advantage when planning and working with engineers. However, people tend to over-index on this knowledge, which blinds us to opportunities. For example, when building our search, the goal was to help our users become more productive. Doing a good job, we reasoned would increase user engagement, retention and satisfaction. 
As a result, everything we chose to learn, research, build and measure around search was based on those goals, not vanity relevancy metrics like DCG (Discounted cumulative gain — a fancy way of measuring the number of clicks on the first couple of results). Our role is to connect a love of the problem with the available technology. We need to be technical enough to understand the vocabulary of engineers, but not so much as to place restrictions on our goals based on any present technical limitations. To combat this, I would instead recommend planning around any given technology trend rather than its current absolute, e.g. Graph databases and general language models. Aim to build a product roadmap that anticipates what is likely to exist in the medium-term future, and you will always be prosperous. The other thing to understand is that everything in search is speculative. Users provide many perspectives, which ensure the quality of results will be interpreted very differently. Rarely are there single correct answers for everyone (a bit like real life). Our users could provide different satisfaction levels for the same results for the same query on the same day. Finding peace with the ambiguity of search work means discovering ways to reduce uncertainty. The best way to achieve this is by focusing on skills that facilitate robust working methods and ensure strong teamwork. In other words, practicing the “art” of product management in search more valuable than the “science”: communicating, empathy, leading without authority, having difficult conversations, storytelling, making decisions when you don’t have all the information, dealing with ambiguity, inspiring others, and connecting deeply with users. I can’t emphasize enough not to fall into traps like ensuring response times need to be under 100 ms or results need to be 80% accurate or whatever three-letter acronym metric is currently in vogue: DCG, eDCG, Rank K, MAP. 
I mean, be aware of them, but we shouldn’t let them dictate goals. We need to remain user-focused, not search-focused. In many ways, search products are a sum of predictions: like any good product team, we need to predict a little bit at a time and then verify. If you find yourself stuck for solutions, the best advice I can offer is to approach complex search problems by building a platform where people can take raw components and tailor them to their needs. For our teams, that means creating clarity and access to relevant information that helps people make autonomous decisions. For our users, that means building flexibility into our tools to allow them to identify novel ways of solving their problems. We don’t always need to understand all the complexities of a user’s problem. Sometimes we just need to build the tools, features, and ways of working that enable people to do things independently.\nSearch is the Best Product\nInterestingly, search changes our perception of the world and our ability to live within it.\nIt changes text, changes reading (thank you, snippets), changes how we understand, and even what it means to be literate. Our readers can now easily find connections that an author never intended.\n\n“Search engines have become sense-making engines, helping to chart, connect and explore infinite textual maps.” Anne-Laure Le Cunff, founder of Ness Labs\n\nReally, what a search is, is a bunch of little problems with a bunch of little solutions. Building such products brings the responsibility of knowing that we enable anyone to form opinions that justify their actions, beliefs, and lives. These are opinions from which we formulate praise for our friends and contempt for our enemies, our long-term projects, our deepest self-doubts, and our highest hopes. (Paraphrasing philosopher Richard Rorty, 1989)\nOur greatest challenge as search people is that users are messy. We have to work knowing that user queries are highly variable and usually not well formulated. 
The challenge becomes understanding a user’s intent when even the user doesn’t know what they want. Philosophy rarely rhymes with software engineering, but a big part of building a search is figuring out what your user means. In the hope of more clicks, I see teams building ever more complex personalization, semantic and segmentation models that aim to filter out irrelevance and boost the familiar. Morally, I believe this is a mistake.\nRecommender or historical click-boosting solutions built into search assume that users wish their future results to depend on historical behavior. It’s an assumption that, if executed over a long enough time, robs individuals of optionality and limits our ability to express ourselves. One of my favorite philosophers, Isaiah Berlin, would call this an attack on our final vocabulary (the set of words we carry about that justifies our actions, our beliefs, our sense of the world, and our place within it). We would call this an “echo chamber”. If you plan to do such things, always ensure you do so transparently. Medium.com, for example, released a feature that allows you to understand and refine your recommendations. It’s not only good for society but also good for business. I’ve found that the primary source of a company’s dominance is whether it designs its product and business model to be perfectly aligned with its customers’ interests. “If your model will suffer from perfect transparency with customers, you’re not in an unbeatable position,” says venture capitalist Jesse Beyroutey. Building a great search is the same as building anything else that’s great: assess the opportunity, then define what needs to be made. My day looks like many other product managers’: I get up in the morning, check metrics, prioritize tasks for sprints, write user stories, refine with my team, and conduct research. The qualities of a great search team are adaptability, optimism, and humility. 
If you can fall in love with the problem, then being voracious in taking the time to learn and understand what makes search work becomes second nature.\nHere are some resources I strongly recommend: \n Relevant Search, Doug Turnbull and John Berryman (Book)\n What Every Software Engineer Should Know About Search, Max Grigorev\n Measuring Search: Metrics Matter, James Rubinstein\n On Search Leadership, Daniel Tunkelang\n Search Product Management: The Most Misunderstood Role in Search? James Rubinstein\n The Future of Text, Frode Hegland (Book)\n\nWhat’s so satisfying about working on search is knowing that the tool you build brings people so much value. There are many exciting problems and pieces of technology to play with, plus, you get to work with intelligent people (engineers, analysts, data scientists). Don’t feel discouraged if you don’t think you have the technical know-how for the job. A great advantage non-technical product managers bring is a problem-first approach. If nothing else, this is the most crucial skill I find technical PMs forget when working through tricky issues. Though in their defense, one could argue I’m simply being naïve and merely passing the hard work onto others.\nThe downside of working with search is its ability to induce a sense of relative deprivation upon its makers. The pace of change is fast, with new groundbreaking models or frameworks released monthly. There are many technological distractions, and it’s easy to feel like the world is leaving you behind. Most of your daily work is less cutting-edge AI and more keeping data up to date, deduplicated, synced, and working at scale. Plus, users will keep coming up with creative ways to break your service. However, the great thing about search is it’s not a solved problem. Every search solution can look and feel very different. Doing a great job at search can mean the difference between a person having an informational revelation or missing a lifetime opportunity. 
How often can you say you were able to work on something that achieved that? \n“No learner has ever found that he ran short of subjects to explore. But many people who avoided learning, or abandoned it, find that life is drained dry.” Maria Montessori, Italian physician"
},
{
"score": 0.18329566717147827,
"title": "SEO Tactics Die, But SEO Never Will",
"id": "https://moz.com/blog/seo-tactics-die-but-seo-never-will",
"url": "https://moz.com/blog/seo-tactics-die-but-seo-never-will",
"publishedDate": "2013-06-13",
"author": "Dr Peter J Meyers",
"text": "This is a post that has been gnawing at the edges of my brain for years, and I think the time has finally come to write it. Our recent Moz re-brand launched the inevitable 4,789th wave (and that’s just this year) of \"SEO Is Dead\" posts. This isn't a post about our reasons for broadening our brand (Rand has talked extensively about that) – it’s a post about why I think every declaration of SEO's demise misses something fundamental about our future. This is going to get philosophical, so if you’d rather go make a sandwich, I won’t stop you.\nThe Essence of Search\nLet’s start with a deceptively simple question – How big is the internet? I’ll attempt to answer that by creating a graph that borders on being silly:\nThe internet is so big that even Google got tired of counting, and it's growing exponentially. Five years have passed since they announced the trillion mark, and the article suggests that URL variations now make the potential indexed page count theoretically infinite.\nWe can't just print out the internet and read it at our leisure. We need a filter – a way to sift and sort our collected content – and that's essentially all that search is. However search evolves or whatever happens to Google, the expansion of human knowledge is accelerating. Unless we suffer a technological cataclysm, we will need search, in some form, for the rest of human history.\nSearchers and Searchees\nAs long as search exists, it also stands to reason that there will be two groups of people: (1) People who want to find things, and (2) People who want to be found. On any given day, we may each be both (1) and (2), and the \"people\" who want to be found could be businesses, governments, etc., but for every search there will be some entity who wants to have a prominent position in that search result.\nThe desire to be found isn't new or unique to online search – just ask Melvil Dewey or call up \"AAA Aardvark Plumbing\" in the Yellow Pages. 
What's unique to online search is that the system has become so complex that automated technology governs who gets found, and as the scope of information grows, that's not about to change. Ultimately, whenever a system controls who will be found, then there will be a need for people who understand that system well enough to help entities end up on the short list.\nThis goes beyond manipulative, \"black hat\" practices – data needs to be structured, rules complied with, and many pieces put into place to make sure that the information we put out there is generally friendly with the systems that catalog and filter it. Over time, these systems will get more sophisticated, but they will never be perfect. As long as search exists, there will be a need for experts who can optimize information so that it can be easily found.\nSEO Is Not One Tactic\nWhen we say \"SEO Is Dead!\", we’re usually reacting to the latest tactical fad or announcement from Google. Ultimately, though, SEO is not one tactic, and even though Google currently dominates the market, SEO doesn't live and die with Google. I'm 42 years old, and the public internet as we know it now hasn't existed for even half of my life. Google is a teenager, and I strongly suspect I'll outlive them (or at least their dominance).\nThere's no doubt that search is changing, and our industry is barely out of its infancy. In the broad sense, though, the need for people who can help construct findable information and attract people to that information will outlive any single tactic, any individual SEO expert, and even any search engine.\nThe Construct: Search in 2063\nSergei had spent his entire adult life learning how to manipulate The Construct. Fifteen years earlier, the unthinkable had happened – the collected knowledge of humanity had grown so quickly that there was no longer enough space in the accessible universe to store it in. 
The internet became The Construct, and it now spanned both space and time.\nSince no human could adequately comprehend 4-dimensional data (early attempts at neural interfaces drove a few pioneers to insanity), The Construct had to be projected onto a 3-dimensional orb suspended in a vacuum, affectionately known as the “space egg.” With more than a decade of practice, Sergei manipulated the egg like an omelette chef at a 5-star brunch, and what his clients paid him made their $37 mimosas look reasonable.\nThis morning was worse than most. The Construct’s AI had detected an unacceptable level of manipulation and was adjusting the Core Algo. Sergei could already see the surface of the egg being rewritten, and the change was costing his clients millions with every passing minute. Luckily, his defensive bots were already at work, rewriting semantic data to conform to the ripples in the Algo. One thing was certain: the life of a Space Egg Optimizer was never dull."
},
{
"score": 0.18324322998523712,
"title": "System: Reinventing search for research",
"id": "https://about.system.com/blog/reinventing-search-for-research",
"url": "https://about.system.com/blog/reinventing-search-for-research",
"publishedDate": "2023-05-04",
"author": "Adam Bly",
"text": "For seemingly the first time in the 25 years since Google’s founding, search is ready to be disrupted. Several forces, all intersecting at this very moment, set the stage for reinvention: the increasing bankruptcy of the canonical “thousand blue links” experience; the emergence of large language models; the critical factual failures of the new chatbots; the continued explosion of data and information; and finally, our awareness of the rising complexity and interdependence of the most important issues we face as individuals, organizations, and society. In the past few months, we’ve seen an explosion of ideas around how to improve search. I’ll call the dominant search paradigm “The Google Way” and the emerging search paradigm “The ChatGPT Way”.\n\nThe Google Way \n The ChatGPT Way\n\nUsers search in a search bar \n Users ask questions in a search bar\n\nSearch returns a list of semantically relevant objects with links to original sources, from a corpus of webpages indexed by the search engine \n Search generates an answer as a block of text, based on training data from some fixed moment in time\n\nObjects are generally ranked based on the “importance” of the source (the number and “importance” of other sources that link to it)\n\nIt’s exciting to be living in a moment where a fundamental building block of the modern information technology stack is ripe for reinvention and improvement, and already it feels inevitable that the new paradigm will be inextricably woven into the fabric of everyday life. For our purposes, though, I’ll discuss search for professional use, from health to finance, where the accuracy, reliability, and provenance of information matters considerably, and where decision-making is usually expected to be rational and data-driven. 
From that standpoint, it’s clear that search The Google Way, which is employed in professional search engines like Web of Science and Scopus, is no longer fit for purpose: helping us find the best, most trustworthy, most complete information to make the best possible decisions. But I would suggest that search The ChatGPT Way isn’t fit for that purpose either, precisely because of what it has in common with the paradigm it’s disrupting.\nThe problems with search\nI’ll organize the problems I see with search into two classes: The depth problem – How do we solve for the volume of information? The breadth problem – How do we avoid losing the forest for the trees? These two problems share something in common: they both stem from an overreliance on language in how search engines organize and retrieve or generate information. In their most basic form, The Google Way finds things that match keywords, and The ChatGPT Way predicts the best next word in a sentence. In other words: language is the basis for retrieval in The Google Way, while language is the basis for retrieval as well as the form of the answer in The ChatGPT Way. But language, in and of itself, has no reliable conception of how the world works, no empirical sense of cause and effect, no understanding of a system. Language is not the same thing as fact. As the computer scientist David Waltz put it to Esther Dyson: \"Words are not in themselves carriers of meaning, but merely pointers to shared understandings.\" Any search paradigm based primarily on language, then, runs the risk of misrepresenting the world it purports to document. (See Galactica, the LLM for science that Meta released and promptly shut down last year.) How we organize and discover information for professional use – and ultimately, how we make decisions – needs to change.\nThe depth problem\nThe volume of information that professionals need to be familiar with continues to grow superlinearly. 
By one measure, the volume of scientific information now doubles every 17 years, while the amount of medical research doubles every 5 years – but the tools available for navigating that information The Google Way haven’t changed in decades. The physicians we spoke to repeatedly referred to the current paradigm as the “hammer and chisel method” of finding papers; it’s slow and unwieldy, requiring an extraordinary amount of manual labor. Search engines like Google Scholar and PubMed, then, are not just unable to solve the depth problem, but get worse at it every year as the corpus of research grows. Simply put, there are just too many links to dig through. And by reinforcing biases towards certain types of information sources through pagination and ranking factors, The Google Way discourages the kind of wide-ranging inquiry that is essential to progress. The ChatGPT Way is equally unsuited to the task of solving for depth, albeit for very different reasons.\n\nSolving the depth problem \nthe Google Way \n Solving the depth problem \nthe ChatGPT Way\n\nA thousand blue links \n An answer in text form\n\nSearch produces an endless list of links that takes too much time to sort through Many searches never get to Page 2 of the results, leading to self-reinforcing biases in information sources Ranking is biased (e.g. PageRank, Impact Factor) and these biases are self-reinforcing \n Has no model of the world to lean on and correct itself There are no or few citations to back up the answer provided; uncited means untrustable Trained to sound authoritative Prone to hallucinations\n\nLLMs, as Ezra Klein recently wrote, “are built to persuade,” despite the fact that they’re not guided by fidelity to a model of the world as it actually is. “They’re not trained to predict facts,” A.I. ethics researcher Margaret Mitchell told Klein. 
“They’re essentially trained to make up things that look like facts.” As Bender et al argue, this “ersatz fluency and coherence” heightens the risk of automation bias, which could entail significant real-world harm. From the standpoint of search in a professional context, The ChatGPT Way should give us serious pause. LLMs hallucinate (or fabricate); they invent citations that don’t exist. And yet they proceed with the confidence and certainty of that one acquaintance who “does their own research,” blithely asserting what philosopher Harry Frankfurt would define as pure bullshit – content with no concern for the truth – as clear-cut statements of fact.\nThe breadth problem\nWe are living in an era of unimaginable complexity. The climate, our bodies, our cities, the economy: all are systems made up of endlessly interacting, flowing, dependent parts. But while the biggest challenges we face – at both the individual and global scale – are systemic, our data and knowledge are still organized into silos: disciplines, departments, nation-states, organs... As I wrote when we launched our public beta, this fundamental incongruity makes it nearly impossible to think, plan, and act systemically – to consider the way a system's individual parts interrelate, how systems change over time, and how they work within even larger systems. We can’t navigate our “whitewater world,” to borrow a term from John Seely Brown. We struggle with context and are stifled in our ability to reliably predict outcomes, make decisions, and mitigate risks. “How do you look at the disturbances of the surface structure,” asked Brown in a 2017 talk at UC Irvine, “and understand something about the deep structure? How do you interpret those flows, or what lies beneath those little ripples? [...] Reading context isn’t that simple. 
It’s clear that our ability to read context is seriously impaired.”\nSearch as we know it reinforces silos. Limited by its strictly semantic backbone, it ossifies the fragmentation of knowledge that we have encoded over time in language, failing to help us consider context. Put another way: search is good at helping us find massive amounts of information about trees, but terrible at showing us the forest. It does nothing to help us see how the parts of systems relate, to help us understand the breadth and complexity of the world as it actually is – and to anticipate what might happen next. Search The Google Way or The ChatGPT Way doesn’t naturally help us consider questions like: What else do I not know to search for? What else could be causing this? What could happen if this happens? What other factors should I take into account? Instead, it strictly limits the scope of inquiry by retrieving (and producing) language that corresponds to our starting point. As a direct result, search today is technically unable to help us make the best possible decisions. It may even compound our challenges. We believe there’s a better way.\nFrom silos to systems\nIn 2018, we invented and patented a new architecture to organize information based primarily on statistical relationships rather than semantic ones. You can read more here, but in brief: System is made up of nodes and edges. The nodes are measurable variables in the world (like body mass index or concentration of fine particulate matter) and the larger concepts they measure (like obesity or air pollution). The edges indicate a statistical relationship between two nodes in a specific context. Unlike classic knowledge graphs, the edges here are not primarily semantic. Instead, they are constructed around quantified evidence of association between things in the world. A causes B vs. A is a type of B. 
We use AI to extract statistical relationships from peer-reviewed studies, datasets, and models. You can read about the steps we take to ensure accuracy in extraction here. We organize, normalize, and link these millions (in the future, billions) of relationships so that anything in the world can be connected to everything else. Today, just over a year since we launched System in public beta, we are excited to make two major announcements. First, we have extracted the statistical results from all original studies on PubMed using AI. To the best of our knowledge, this is the first time statistical information retrieval has been achieved at this scale. This unique structured data is helping to relate tens of thousands of topics across health and biomedical research on System. Second, we are announcing the first application built on top of System: System Pro.\nPurpose-built with explainability and trustworthiness at its core.\nPowered by AI, System Pro reinvents search for research. Unlike other search engines and chatbots, System Pro is purpose-built with explainability and trustworthiness at its core, combining LLMs with unique structured data and a patented graph architecture. It’s designed to help break down disciplinary silos and support systems-based research and solutions. It’s the fastest and most reliable way of finding, synthesizing, and contextualizing scientific literature – starting in health and life sciences and then expanding to all domains of knowledge. We built it for professionals who depend on accurate, trustworthy, and up-to-date information to make decisions. You can sign up for a free trial starting today. As is our practice, we have also published the Release Risks associated with today’s release.\nSearch reinvented for research\nWhat if you could get reliable, up to date, and transparently sourced answers to your searches? That could result in massive time savings. 
And what if it were just as easy to contextualize something as it is to find content about it? That would meaningfully improve decision-making, making it more rational and systemic, and therefore more accurate and reliable. It would reduce unintended consequences, lead to more efficient and reliable interventions and better decisions, and empower decision-makers to place bigger bets with greater confidence. Here’s how System Pro is bringing this vision to life.\nSolving the depth problem\nBased on your search, System Pro assembles all the relevant statistical findings we’ve extracted with AI from peer-reviewed studies. We cluster those statistical findings to group findings about the same or similar variables, and convert them into natural language following strict rules. Finally, we prompt an LLM to synthesize those clustered findings, using and citing the findings alone. The end result is an accurate, trustworthy, and up-to-date overview of the relevant research. You can also quickly pinpoint where there is agreement and disagreement.\nFor the first time, all sources used to generate the synthesis are clearly cited and linked back to the original source, and you can filter the synthesis to make it more relevant to your needs. All of this is only possible because of System’s patented architecture. We combine LLMs with unique structured data, aligning our synthesis with a model of the world as it actually is.\nSyntheses are based on all the information on System, which includes the results of all original research articles on PubMed. We are continuing to expand our source material, first within health and life sciences and then beyond to all domains of knowledge. We’re deliberately taking our time to get each knowledge space right before we move on. We think accuracy is just too important to our users and their applications. 
Solving the breadth problem\nBased on your search, System Pro recommends statistically related topics that you may want to consider to contextualize your search – both upstream and downstream from where you started. This is a brand new kind of recommendation that’s only possible because of System’s statistical architecture – we’re excited to hear your feedback and develop it further.\nSystem Pro also assembles a systems map of your search that shows all the other topics that are statistically related (if you’ve tried System’s public beta, this view will be familiar to you). You’ll discover new connections to help you strengthen research plans and clinical decisions. A systemic view, not a siloed one.\nSystem Pro marks a new approach to search for professional use: easier than The Google Way; more transparent, reliable and explainable than The ChatGPT Way; built for discovery, not just search – for the complexity and interconnectedness of the world as it actually is. We hope you’ll give it a try.\n\nHow System Pro is different\n\nReliable \n Syntheses are based exclusively on the results of peer-reviewed scientific studies Material used to generate syntheses is explicitly constrained to sentences we generate from statistical findings and metadata Only models that meet high standards of accuracy are deployed, and performance is monitored at every step\n\nTransparent \n Syntheses are clearly cited with all sources used. The average synthesis on System Pro cites an industry-leading 36 studies. 
Findings and studies are grouped into meaningful clusters to help you see and analyze the research landscape\n\nUp to date \n Refreshed daily with the latest research (today on PubMed)\n\nSystemic \n Recommends and visualizes topics that are part of the broader system your search falls within Directionality can be layered into searches to show you the system through the most relevant lens\n\nTowards the advancement of knowledge\nA few weeks ago, the authors of the “Stochastic Parrots” paper (cited above) – Timnit Gebru, Emily M. Bender, Angelina McMillan-Major, and Margaret Mitchell – issued a statement in response to a widely read Future of Life Institute letter calling for an immediate pause on “giant AI experiments.” “We should be building machines that work for us, instead of ‘adapting’ society to be machine readable and writable.” Machines that work for us. A vision of technological progress that centers human agency. One that resonates with our own vision, here at System, of what search can and should be. The ChatGPT Way reduces scientific endeavor to the stuff of “unfathomable training data” for LLMs, abstracting away both the existence of the people who did the research that an answer is based on and the context (and possible bias) of that research. On System Pro, you see the authors and the populations they studied. It’s not just about transparency of sources for the sake of trust, which I think is critical, but a recognition of how knowledge advances: bit by bit, standing on the shoulders of the people who came before. System Pro was built to amplify human work. Instead of using AI as an end in itself, we’re using it as a tool to link together, for the first time at this scale and breadth, the work of researchers the world over. To show it all in context and make it more than the sum of its parts."
},
{
"score": 0.18305742740631104,
"title": "The end of the Googleverse",
"id": "https://www.theverge.com/23846048/google-search-memes-images-pagerank-altavista-seo-keywords",
"url": "https://www.theverge.com/23846048/google-search-memes-images-pagerank-altavista-seo-keywords",
"publishedDate": "2023-08-28",
"author": "Ryan Broderick",
"text": "The first thing ever searched on Google was the name Gerhard Casper, a former Stanford president. As the story goes, in 1998, Larry Page and Sergey Brin demoed Google for computer scientist John Hennessy. They searched Casper’s name on both AltaVista and Google. The former pulled up results for Casper the Friendly Ghost; the latter pulled up information on Gerhard Casper the person. What made Google’s results different from AltaVista’s was its algorithm, PageRank, which organized results based on the amount of links between pages. In fact, the site’s original name, BackRub, was a reference to the backlinks it was using to rank results. If your site was linked to by other authoritative sites, it would place higher in the list than some random blog that no one was citing. Google officially went online later in 1998. It quickly became so inseparable from both the way we use the internet and, eventually, culture itself, that we almost lack the language to describe what Google’s impact over the last 25 years has actually been. It’s like asking a fish to explain what the ocean is. And yet, all around us are signs that the era of “peak Google” is ending or, possibly, already over. This year, The Verge is exploring how Google Search has reshaped the web into a place for robots — and how the emergence of AI threatens Google itself. There is a growing chorus of complaints that Google is not as accurate, as competent, as dedicated to search as it once was. The rise of massive closed algorithmic social networks like Meta’s Facebook and Instagram began eating the web in the 2010s. More recently, there’s been a shift to entertainment-based video feeds like TikTok — which is now being used as a primary search engine by a new generation of internet users. For two decades, Google Search was the largely invisible force that determined the ebb and flow of online content. Now, for the first time since Google’s launch, a world without it at the center actually seems possible. 
We’re clearly at the end of one era and at the threshold of another. But to understand where we’re headed, we have to look back at how it all started. If you’re looking for the moment Google truly crossed over into the zeitgeist, it was likely around 2001. In February 2000, Jennifer Lopez wore her iconic green Versace dress to the Grammys; searches for the dress, former Google CEO Eric Schmidt would later say, inspired how Google Image Search functioned when it launched in summer 2001. That year was also the moment when users began to realize that Google was important enough to hijack. The term “Google bombing” was first coined by Adam Mathes, now a product manager at Google, who described the concept in April 2001 while writing for the site Uber.nu. Mathes successfully used the backlinks that fueled PageRank to make the search term “talentless hack” bring up his friend’s website. Mathes did not respond to a request for comment. A humor site called Hugedisk.com, however, successfully pulled it off first in January 2001. A writer for the site, interviewed under the pseudonym Michael Hugedisk, told Wired in 2007 that their three-person team linked to a webpage selling pro-George W. Bush merchandise and was able to make it the top result on Google if you searched “dumb motherfucker.” “One of the other guys who ran the site got a cease and desist letter from the bombed George Bush site’s lawyers. We chickened out and pulled down the link, but we got a lot of press,” Hugedisk recounted. “It’s difficult to see which factors contribute to this result, though. It has to do with Google’s ranking algorithm,” a Google spokesperson said of the stunt at the time, calling the search results “an anomaly.” But it wasn’t an anomaly. In fact, there’s a way of viewing the company’s 25-year history as an ongoing battle against users who want to manipulate what PageRank surfaces. 
“[Google bombing] was a popular thing — get your political enemy and some curse words and then merge them in the top Google Image resolve and sometimes it works,” blogger Philipp Lenssen told The Verge. “Mostly for the laughs or giggles.” Lenssen still remembers the first time he started to get a surge of page views from Google. He had been running a gaming site called Games for the Brain for around three years without much fanfare. “It was just not doing anything,” he told The Verge. “And then, suddenly, it was a super popular website.” It can be hard to remember how mysterious these early run-ins with Google traffic were. It came as a genuine surprise to Lenssen when he figured out that “brain games” had become a huge search term on Google. (Even now, in 2023, Lenssen’s site is still the first non-sponsored Google result for “brain games.”) “Google kept sending me people all day long from organic search results,” he said. “It became my main source of income.” Rather than brain games, however, Lenssen is probably best known for a blog he ran from 2003 to 2011 called Google Blogoscoped. He was, for a long time, one of the main chroniclers of everything Google. And he remembers the switch from other search engines to Google in the late 1990s. It was passed around by word of mouth as a better alternative to AltaVista, which wasn’t the biggest search engine of the era but was considered the best one yet. In 2023, search optimization is a matter of sheer self-interest, a necessity of life in a Google-dominated world. The URLs of new articles are loaded with keywords. YouTube video titles, too — not too many, of course, because an overly long title gets cut off. Shop listings by vendors sprawl into wordy repetition, like side sign spinners reimagined as content sludge. And it goes beyond just Google’s domain.
Solid blocks of blue hashtags and account tags trail at the end of influencer Instagram posts. Even teenagers tag their TikToks with #fyp — a hashtag thought to make it more likely for videos to be gently bumped into the algorithmic feeds of strangers. The word SEO “kind of sounds like spam when you say it today,” said Lenssen, in a slightly affected voice. “But that was not how it started.” To use the language of today, Lenssen and his cohort of bloggers were the earliest content creators. Their tastes and sensibilities would inflect much of digital media today, from Wordle to food Instagram. It might seem unfathomable now, but unlike the creators of 2023, the bloggers of the early 2000s weren’t in a low-grade war with algorithms. By optimizing for PageRank, they were helping Google by making it better. And that was good for everyone because making Google better was good for the internet. This attitude is easier to comprehend when you look back at Google’s product launches in these early years — Google Groups, Google Calendar, Google News, Google Answers. The company also acquired Blogger in 2003. “Everything was done really intelligently, very clean, very easy to use, and extremely sophisticated,” said technologist Andy Baio, who still blogs at Waxy.org. “And I think that Google Reader was probably the best, like one of the best, shining examples of that.” “Everybody I knew was living off Google Reader,” recalled Scott Beale of Laughing Squid. Google Reader was created by engineer Chris Wetherell in 2005. It allowed users to take the RSS feeds — an open protocol for organizing a website’s content and updates — and add those feeds into a singular reader. If Google Search was the spinal cord of 2000s internet culture, Google Reader was the central nervous system. “They were encouraging people to write on the web,” said Baio. Bloggers like Lenssen, Baio, and Beale felt like everything Google was doing was in service of making the internet better. 
The tools it kept launching felt tied to a mission of collecting the world’s information and helping people add more content to the web. Many of these bloggers feel differently now. Lenssen said he now sees SEO as more or less part of the same nefarious tradition as Google bombing. “You want a certain opinion to be in the number one spot, not as a meme but to influence people,” he said. Most of the other bloggers expressed a similar change of heart in interviews for this piece. “When Google came along, they were ad-free with actually relevant results in a minimalistic kind of design,” Lenssen said. “If we fast-forward to now, it’s kind of inverted now. The results are kind of spammy and keyword-built and SEO stuff. And so it might be hard to understand for people looking at Google now how useful it was back then.” But there is one notable holdout among these early web pioneers: Danny Sullivan, who, during this period, became the world’s de facto expert on all things search. (Which, after the dawn of the millennium, increasingly just became Google Search.) Sullivan’s expertise gives his opinion some weight, though there is one teeny little wrinkle — since 2017, he’s been an employee of Google, working as the company’s official search liaison. Which means even if he doesn’t think they are, his opinions about search now have to be in line with Google’s opinions about search. According to Sullivan, the pattern of optimizing for search predates Google — it wasn’t the first search engine, after all. As early as 1997, people were creating “doorway pages” — pages full of keywords meant to trick web crawlers into overindexing a site. More crucially, Sullivan sees Google Search not as a driver of virality but as a mere echo. “I just can’t think of something that I did as a Google search that caused everybody else to do the same Google search,” Sullivan said.
“I can see that something’s become a meme in some way. And sometimes, it could even be a meme on Google Search, like, you know, the Doodles we do. People will say, ‘Now you got to go search for this; you’ve got to go see it or whatever.’ But search itself doesn’t tend to cause the virality.” Those hundreds of millions of websites jockeying for placement on the first page of results don’t influence how culture works, as Sullivan sees it. For him, Google Search activity does not create more search activity. Decades may have passed, but people are essentially still searching for “Jennifer Lopez dress.” Culture motivates what goes into the search box, and it’s a one-way street. But causality is both hard to prove and disprove. The same set of facts that leads Sullivan to discount the effect of Google on culture can just as readily point to the opposite conclusion. In February 2001, right after Hugedisk’s Google bomb, Google launched Google Groups, a discussion platform that integrated with the internet’s first real social network, Usenet. And that same month, what is largely considered to be the first real internet meme, “All Your Base Are Belong To Us,” was launched into the mainstream after years of bouncing around as a message board inside joke. It became one of the largest search trends on Google, and an archived Google Zeitgeist report even lists the infamous mistranslated video game cutscene as one of the top searches in February 2001. Per Sullivan’s logic, Google Groups added better discovery to both Usenet and the myriad other message boards and online communities creating proto-meme culture at the time. And that discoverability created word-of-mouth interest, which led to search interest. The uptick in searches merely reflected what was happening outside of Google.
But you can just as easily conclude that Google — in the form of Search and Groups — drove the virality of “All Your Base Are Belong To Us.” “All Your Base Are Belong To Us” had been floating around message boards as an animated GIF as early as 1998. But after Google went live, it began mutating the way modern memes do. A fan project launched to redub the game, the meme got a page on Newgrounds, and most importantly, the first Photoshops of the meme showed up in a Something Awful thread. (Consider how much harder it would have been, pre-Google, to find the assets for “All Your Base Are Belong To Us” in order to remix them.) That back and forth between social and search would create pathways for, and then supercharge, an online network of independent publishers that we now call the blogosphere. Google’s backlink algorithm gave a new level of influence to online curation. The spread of “All Your Base Are Belong To Us” — from message boards, to search, to aggregators and blogs — set the stage for, well, how everything has worked ever since. SEO experts like Sullivan might rankle at the idea that Google’s PageRank is a social algorithm, but it’s not not a social mechanism. We tend to think of “search” and “social” as competing ideas. The history of the internet between the 2000s and the 2010s is often painted as a shift from search engines to social networks. But PageRank does measure online discussion, in a sense — and it also influences how discussion flows. And just like the algorithms that would eventually dominate platforms like Facebook years later, PageRank has a profound effect on how people create content. Alex Turvy, a sociologist specializing in digital culture, said it’s hard to map our current understanding of virality and platform optimization to the earliest days of Google, but there are definitely similarities. “I think that the celebrity gossip world is a good example,” he said. 
“Folks that understood backlinks and keywords earlier than others and were able to get low-quality content pretty high on search results pages.” He cited examples such as Perez Hilton and the blogs Crazy Days and Nights and Oh No They Didn’t! Over the next few years, the web began to fill with aggregators like eBaum’s World, Digg, and CollegeHumor. But even the creators of original high-quality content were not immune to the pressures of Google Search. Deb Perelman is considered one of the earliest food bloggers and is certainly one of the few who’s still at it. She started blogging about food in 2003. Her site, Smitten Kitchen, was launched in 2006 and has since spawned three books. In the beginning, she says, she didn’t really think much about search. But eventually, she, like the other eminent bloggers of the period, took notice. “It was definitely something you were aware of — your page ranking — just because it affected whether people could find your stuff through Google,” she said. It’s hard to find another sector more thoroughly molded by the pressures of SEO than recipe sites, which, these days, take a near-uniform shape as an extremely long anecdote (often interspersed with ads), culminating in a recipe card that is remarkably terse in comparison. The formatting and style of food bloggers has generated endless discourse for years. The reason why food blogs look like that, according to Perelman, is pretty straightforward: the bloggers want to be read on Google. That said, she’s adamant that most of the backlash against food bloggers attaching long personal essays to the top of their posts is obnoxious and sexist. People can just not read it if they don’t want to. But she also acknowledged writers are caving to formatting pressures.
(There are countless guides instructing that writers use a specific number of sentences per paragraph and a specific number of paragraphs per post to rank better on Google.) “Rather than writing because there was maybe a story to tell, there was this idea that it was good for SEO,” she said. “And I think that that’s a less quality experience. And yeah, you could directly say I guess that Google has sort of created that in a way.” Sullivan says PageRank’s algorithm is a lot simpler than most people assume it is. At the beginning, most of the tips and tricks people were sharing were largely pointless for SEO. The subject of SEO is still rife with superstition. There are a lot of different ideas that people have about exactly how to get a prominent spot on Google’s results, Sullivan acknowledges. But most of the stuff you’ll find by, well, googling “SEO tricks” isn’t very accurate. And here is where you get into the circular nature of his argument against Google’s influence. Thousands of food bloggers are searching for advice on how to optimize their blogs for Google. The advice that sits at the top of Google is bad, but they’re using it anyway, and now, their blogs all look the same. Isn’t that, in a sense, Google shaping how content is made? “All Your Base Are Belong To Us” existed pre-Google but suddenly rose in prominence as the search engine flickered on. Other forms of content began following the same virality curve, rocketing to the top of Google and then into greater pop culture. Perelman said that one of the first viral recipes she remembers from that era was a 2006 New York Times tutorial on how to make no-knead bread by Sullivan Street Bakery’s Jim Lahey. “That was a really big moment,” she said. True to form, Sullivan doubts that it was search, itself, that made it go viral. “It almost certainly wasn’t hot because search made it hot. Something else made it hot and then everybody went to search for it,” he said. (Which may be true.
But the video tutorial was also published on YouTube one month after the site was purchased by Google.) The viral no-knead bread recipe is a perfect example of how hard it can be to separate the discoverability Google brought to the internet from the influence of that discoverability. And it was even harder 20 years ago, long before we had concepts like “viral” or “influencer.” Alice Marwick, a communications professor and author of The Private Is Political: Networked Privacy and Social Media, told The Verge that it wasn’t until Myspace launched in 2003 that we started to even develop the idea of internet fame. “There wasn’t like a pipeline for virality in the way that it is,” she said. “Now, there is a template of, like, weird people doing weird stuff on the internet.” Marwick said that within the internet landscape of the 2000s, Google was the thing that sat on top of everything else. There was a sense that as anarchic and chaotic as the early social web was out in the digital wilderness, what Google surfaced denoted a certain level of quality. But if that last 25 years of Google’s history could be boiled down to a battle against the Google bomb, it is now starting to feel that the search engine is finally losing pace with the hijackers. Or as Marwick put it, “Google has gotten shittier and shittier.” “To me, it just continues the transformation of the internet into this shitty mall,” Marwick said. “A dead mall that’s just filled with the shady sort of stores you don’t want to go to.” The question, of course, is when did it all go wrong? How did a site that captured the imagination of the internet and fundamentally changed the way we communicate turn into a burned-out Walmart at the edge of town? Well, if you ask Anil Dash, it was all the way back in 2003 — when the company turned on its AdSense program. “Prior to 2003–2004, you could have an open comment box on the internet.
And nobody would pretty much type in it unless they wanted to leave a comment. No authentication. Nothing. And the reason why was because who the fuck cares what you comment on there. And then instantly, overnight, what happened?” Dash said. “Every single comment thread on the internet was instantly spammed. And it happened overnight.” Dash is one of the web’s earliest bloggers. In 2004, he won a competition Google held to google-bomb itself with the made-up term “nigritude ultramarine.” Since then, Dash has written extensively over the years on the impact platform optimization has had on the way the internet works. As he sees it, Google’s advertising tools gave links a monetary value, killing anything organic on the platform. From that moment forward, Google cared more about the health of its own network than the health of the wider internet. “At that point it was really clear where the next 20 years were going to go,” he said. Google Answers closed in 2006. Google Reader shut down in 2013, taking with it the last vestiges of the blogosphere. Search inside of Google Groups has repeatedly broken over the years. Blogger still works, but without Google Reader as a hub for aggregating it, most publishers started making native content on platforms like Facebook and Instagram and, more recently, TikTok. Discoverability of the open web has suffered. Pinterest has been accused of eating Google Image Search results. And the recent protests over third-party API access at Reddit revealed how popular Google has become as a search engine not for Google’s results but for Reddit content. Google’s place in the hierarchy of Big Tech is slipping enough that some are even admitting that Apple Maps is worth giving another chance, something unthinkable even a few years ago.
On top of it all, OpenAI’s massively successful ChatGPT has dragged Google into a race against Microsoft to build a completely different kind of search, one that uses a chatbot interface supported by generative AI. Twenty-five years ago, at the dawn of a different internet age, another search engine began to struggle with similar issues. It was considered the top of the heap, praised for its sophisticated technology, and then suddenly faced an existential threat. A young company created a new way of finding content. Instead of trying to make its core product better, fixing the issues its users had, the company, instead, became more of a portal, weighted down by bloated services that worked less and less well. The company’s CEO admitted in 2002 that it “tried to become a portal too late in the game, and lost focus” and told Wired at the time that it was going to try and double back and focus on search again. But it never regained the lead. That company was AltaVista."
},
{
"score": 0.18229585886001587,
"title": "Our intent to use \"Search intent\"",
"id": "https://magazine.joomla.org/all-issues/june-2022/our-intent-to-use-search-intent",
"url": "https://magazine.joomla.org/all-issues/june-2022/our-intent-to-use-search-intent",
"publishedDate": "2022-06-20",
"author": "Philip Walton",
"text": "Google has a major influence on search. About 78% of all searches on desktop and mobile are using Google. Bing has the second spot with 10.5% and Baidu the third place with 4.25%\nAround 93% of all web traffic is via search, so if you want your articles read, then you need to get them into a search engine and in front of the relevant audience, but how? \nThe how has changed over the years with crude spamming techniques and black hat ways becoming less and less fruitful as the major search engines have tuned their algorithms to bring back more and more refined results.\nBut what is the aim of the search engines, and what are they tuning the algorithms towards?\nClearly, the likes of Google and Bing are not charities with altruism the only consideration before their shareholders and board members, they are interested in a return on their investments. \nThey want us to see adverts, to see offerings that meet our search terms. Yes, they need to offer us pages that make sense. The adverts, as well as the organic search results, need to be relevant; otherwise, we will quickly lose interest in the search engine and switch loyalty. \nHow long the content will keep visitors engaged and looking at more pages, more chances of that advert catching the eye all adds up.\nSo how can they keep us true and ensure that we keep “googling” or “binging”?\nContent is King?\nFor many years the phrase that was often trotted out as a truism was that content is king. Write good content, and you will get good search results. It was not so much the system or the user interface, slow loading sites were the standard, and we, the viewerS, just had to wait to see what was being offered, patience.\nThere is a second part to that: what constitutes good content? Is it a style of writing that captures the imagination? 
Fact-based sterile and dry but with lots of tables and statistics?\nIs it getting celebrities and the famous writing the articles (even if ghosted)?\nAll of these approaches have been tried to varying degrees of success over the years, and as they come and go, the algorithms and the ideas behind the algorithms have changed, adapted and evolved.\nYes, content is still king, but the emphasis on whom the content is for has switched.\nI have pondered on ways to explain this, and the idea came to me the other day while sitting in my local pub, The Compass Ale House in Gravesend.\nIt's a small micro pub with no bar and an ever-changing selection of 6 beers on offer via cask or keg.\nThe pub still is the heart of the community in many English villages and towns, the place to meet friends, chill and plan local events.\nSome nights the conversation flows, you get an exchange of ideas, it's inspiring, and you come away thinking a good night was had by all.\nBut other times, you can be caught by the pub bore, the sort of person that just wants to talk at you. They are unloading their tales from their life, again and again, and you are the unfortunate victim that is caught in the corner and is being talked at and not to.\nSearch intent, what am I wanting to read?\nAnd it occurred to me while sitting drinking half of a particularly malty stout that that is similar to the shift in search over the years.\nWe have often written articles that are our ideas, our story and we write them from our point of view, uninterested in dialogue and hearing the other side's point of view. 
A monologue is designed to be entertaining and insightful, but a monologue nevertheless.\nSearch intent and the concept behind it is moving away from the pub bore way of doing things and is more about a dialogue, a way of writing that complements the points that the other person is making, and a way to talk and listen at the same time, to engage and offer insightful views and facts mindful of the audience listening.\nHow can a web page listen?\nWell, back to the pub and its offering. Mine is a rowing pub as it's near to the River Thames, so the walls have charts of the river. The owner rows and can engage in conversation that is in that niche. Although there are different themes to the evenings, games night, and tastings, they will reflect the pub audience, and the audience then reflects the theme of the pub, they complement each other. So the pilots and river workers say, “that's a great place to chat and relax for an evening”.\nIf randomly the landlord put up a television and decided to play football, drowning out the chat and banter, people would go elsewhere, they would stop attracting those on the theme of the river, but it would be less attractive to those that like to watch football than a regular sports pub which could be relied upon to show all the top games.\nIt's all about niching down and listening to your audience, seeing what they need and building upon it.\nBut how do I know what my audience wants to hear?\nA fair question and one that requires some tools and knowledge.\nSearch intent, as the words imply, amounts to knowing the intent of the person searching.\nHow can we know what they are searching for and whether there are lots of articles already covering such topics? With tools and apps, we can get such information in a reliable and usable way.
The tools will let us then construct the articles that “talk” to the audience but also allow us to listen in on our audience and see what they respond to well.\nI use such tools when writing articles for clients, and I have tested them on my test sites to see if they work, and they do.\nThe tools in question are Semrush, ahrefs, Moz and Ubersuggest. There are others, but I have not used those directly, so I cannot comment.\nSome will allow a free trial or limited use for free. With Ubersuggest, you can do all needed for free, it's just the number of reports and size of samples.\nSo what's the process, how do I get a dialogue rather than a monologue? \nHow do I listen to my audience and engage with them?\nFirst, I plot a rough outline of the article so, for example, say I am writing an article about drinking water. \nRather than just write my thoughts and some statistics, I would use the tools at hand to find what topics are searched for, the volume and the difficulty.\n\nKnowing the SEO difficulty or SD score allows us to see if we have a chance to get near the front of Google for the terms used. The SD ranges from 1 to 100. It is allied to another term, “Domain Authority”, which is the score each website has on the web, its pecking order against other sites.
So a site such as a brand spanking new construction website unleashed on the internet will have a score of under 5, and something very big in terms of traffic, backlinks and articles, such as joomla.org, will have an SD in the range of 90-100.\n\nNotice our keyword term was “why is water important”.\nHad we had a much shorter keyword term such as “water”, the SD is much higher and is more off-topic from what we are probably wanting.\n\nSo, now with such tools, I can take my idea for an article, research the people out there who are searching for some specific terms and tune my article to feature those terms as h2 subtitles.\nStick to the topic of the subtitles in the article and make sure they are all in a coherent niche, so they hang together.\nIt's as if I had overheard them in the pub saying that they wanted a conversation regarding a specific subject, and lo and behold, I walk up and start chatting about exactly that topic. They are going to be much more receptive to the article. Also, as I have set the expectations for Google by tailoring the SD to the ranking of my site, Google is likely to rank me near the front as a search term on one of the h2 sub-topics, it will, if well constructed, rank me high for multiple h2 search terms and with each bring in a sizable amount of traffic.\nNow we have a virtuous circle, as I write on a niche with well-researched articles tailored to a receptive audience and pitched with long terms that are in my reach SD-wise. I get rewarded with backlinks for being authoritative, and I also have a growing domain ranking as long as all my articles stay in that niche.
Link internally between the articles to help grow more of an engaged audience, and soon I can try for the higher SD and larger volume terms.\nKnowing my audience and listening to what they want to hear, then writing to that audience, so their search intent is satisfied by my articles, rather than just talking to them about what I want to say, is the key to success and growth.\nNext time I want to share some insights about the Joomla websites, their Domain Authority and how we can use this and magazine articles to draw in a wider audience to Joomla."
},
{
"score": 0.1811932474374771,
"title": "Is SharePoint Search Dead? “- by Jeff Fried, CTO of BA Insight - Search Explained",
"id": "https://searchexplained.com/is-sharepoint-search-dead/",
"url": "https://searchexplained.com/is-sharepoint-search-dead/",
"publishedDate": "2017-06-07",
"author": "Jeff Fried",
"text": "I’m asked regularly whether Microsoft has abandoned the Enterprise Search market. This was a frequent question in 2015, and less frequent in 2016, but there’s been a recent uptick, and I got this question 10 times last month. As a long-standing search nerd that lives close to Microsoft, I know the answer is NO. But I was baffled about why this question keeps coming up.\nSo I decided to investigate. This blog takes you through what I’ve found and how you can answer the question when it comes up. Search Explained is the perfect place to publish it.\nWhy do people presume Microsoft is out of the Enterprise Search Market?\n “Reports of my death were greatly exaggerated” – a quip attributed to Mark Twain (actually a misquotation) – has become a popular culture trope found in movies, song lyrics, and tweets. When this happens in real life, it’s usually a matter of perception and miscommunication. That’s definitely the case with SharePoint Search. \nOverall market perceptions change slowly, so the roots of this question go back over a few years. When I went through it, there are 6 reasons people presume Microsoft is out of the market:\n 1) Changing Marketing and Terminology – the market as a whole, and Microsoft in particular, has a tendancy to rebrand things or change they way they talk about them, which can be confusing.\nMarketers and analysts in the industry have been uneasy about the word “search” for a long time, and there have been periodic attempts to redefine it or replace it. “Search” has several weaknesses as a term: it’s primarily used as a verb (which is weak), it denotes an incomplete activity that’s not useful in and of itself (seeking something rather than locating it), and it has accreted some baggage over the years, including the expectations that enterprise search is just like web search. People instead have used ‘find’ and ‘findability’; ‘discovery’ and/or ‘exploration’; ‘question answering’ and different variants of ‘analytics’. 
The enterprise search field has been called ‘information retrieval’; ‘unified information access’; and recently, ‘cognitive search’ and ‘insight engines’.\nThere’s a place and role for all these terms, but it’s confusing. Meanwhile, ‘search’ continues to be widely used and none of the alternative names have really displaced it. The term search just refuses to go away.\nWhen Microsoft stopped talking much about Enterprise Search and started talking more about Delve and Discovery, it contributed to a sense that Microsoft wasn’t doing search any more.\n 2) Long, confusing product evolution – within Microsoft, enterprise search has been through a number of transitions, not the least of which was the acquisition of FAST back in 2008. I drew out a picture (see below) about all the transitions – the dotted blue lines denote a major re-architecture, which you can see has happened a lot.\nWhen I co-wrote “Professional Microsoft Search” about the 2010 generation, there were 7 variants in the product lineup. When the 2013 generation came out, fully integrating what had been a next-gen technology project from FAST, it changed search dramatically. SharePoint Search is Dead, Long Live SharePoint Search was the title of a TechEd session by my friend Neil Hodgkinson at that time – touting how completely the new search had been revamped with SharePoint 2013.\nThere’s been plenty of name changes here too, contributing to the confusion/perception. The FAST brand was dropped in 2013 (though it was partially resurrected in a variety of Microsoft material). I have heard people say that Microsoft acquired FAST then dropped it after a couple of years! This is the complete opposite of the truth, but contributes to the rumors of SharePoint Search’s demise.\n 3) Bundling – along with the 2013 generation, Microsoft dropped their Search Server products. This was to simplify the product lineup. 
After all, if you wanted a great standalone search product, you could just use a few SharePoint servers, configure them with search services, and not use anything else. SharePoint still had a very strong search capability. In fact, the free version (SharePoint Foundation) became a very capable search product, up to 1M items.\nHowever, the ‘Search Server’ name was gone, another easy-to-misinterpret signal. Plus, there was now a genuine constraint: to use SharePoint as a standalone search product, you still needed to be licensed for SharePoint. If you already own SharePoint, the price of using a few more servers to create a search farm is very small. But if you don’t already own SharePoint, it usually does not make financial sense to bring it in for the sole purpose of search.\n\n4) Gartner Coverage – among analysts covering the search market, Gartner is the best known. In 2013, when Gartner published their Magic Quadrant on Enterprise Search, Microsoft was an upper-right market leader. But in 2014 Microsoft was gone. In 2015 Microsoft was also not included. Why? The criteria for inclusion was to have a standalone search product, and Microsoft did not qualify because of their bundling (neither did Oracle or OpenText). Gartner has a big influence (probably way too much), and this was a signal to many that Microsoft was out of the market.\n 5) Partner Strategy – in search, as in many other areas, Microsoft uses partners to fill out solutions, more so than any other major software vendor. This partner ecosystem is a big advantage in many ways. But there are many areas left to partners, sometimes referred to as “white space”. Microsoft Enterprise Search has a lot of white space – things like connectors, autoclassification, and analytics are provided by partners. 
A company looking for a complete enterprise-scale search solution cannot really get it straight out-of-the-box with SharePoint search; they have to either do extensions themselves or get add-ons from Microsoft partners. There’s a perception that this means Microsoft is not fully engaged in the search market, since they don’t address all the requirements.\n 6) Development focus – after the release of SharePoint 2013, the search development team within Microsoft turned to a new, innovative area: the creation of the Office Graph (initially called “project Oslo” because that’s where most of the development team is based). Bug fixes for search were slower, and new features for search were less frequent. SharePoint 2016 search is essentially the same as SharePoint 2013 search if you use it on-prem. Even people quite close to the development team expressed frustration with the lack of focus on traditional search issues. That also added to the sense that Microsoft had moved on.\nWhen you look at these six areas, it seems natural that people might assume Microsoft had exited the market. Four of these are purely perception, but two have some reality as well. Bundling search into SharePoint truly did make Microsoft search a non-starter for some companies, and Microsoft’s development focus actually did shift away from traditional on-prem search towards the cloud.\nWhat’s now clear to me is that most of these factors came to the fore in 2014 and 2015. Certainly, 2015 was the year I got the most questions on this topic. Ironically, this was shortly after the rollout of SharePoint 2013, which was from Microsoft’s point of view a great success cementing their leadership of the search market. I find that Microsoft has a “mission accomplished” mentality – when they think they’ve won the momentum on the current battlefront, they shift resources and focus almost entirely to a new battlefront. 
This is what happened in search – focus shifted fully to the Office Graph initiative, leaving the market with the impression that Microsoft had left the room.\nSee the second part of this blog post here: “SharePoint Search is still alive and well”\n Learn More\nSubscribe to our Search newsletter to get notified about new content, courses and events here.\n100% knowledge. 0% spam."
},
{
"score": 0.18116018176078796,
"title": "A brief evolution of Search: out of the search box and into our lives",
"id": "https://searchengineland.com/brief-evolution-search-search-box-lives-252622",
"url": "https://searchengineland.com/brief-evolution-search-search-box-lives-252622",
"publishedDate": "2016-06-27",
"author": "Sponsored Content Bing Ads",
"text": "We live in a mobile-first, cloud-first world powered by technology that changes by the nanosecond. And search is no different. Search is changing in look, form and function to become part of the fabric of our everyday lives, barely recognizable from its inception as a text box.\nBy 2020, 50 percent of searches will be voice searches (ComScore). The Amazon Echo, a voice-activated home automation hub, was the top-selling speaker in 2015 (KPCB Internet Trends 2016), getting over 25,000 five-star reviews on Amazon, signaling a real shift in how we conduct searches. As we enter a new era of personalized search, I’d like to take a moment to look back at where we came from. A nod to our past, if you will, as we spiral toward a knowledge-drenched future built upon the very fabric of search.\nThe early days\nI refer to the early days of search as Query Search. Some search pioneers may even remember Archie, considered to be the very first search engine, which launched over 25 years ago. Early query searches had to match the exact wording of a website’s title in order to appear, as search bots only scanned titles. Imagine the frustration of trying to find a website, playing hit or miss across an ocean of content. Engines quickly evolved to index entire pages and return a broader array of results.\nOver the years, many different engines appeared, weighing in with various technologies and pushing search forward with faster listings and smarter indexing. And then, in 2000, something very interesting happened. The pay-per-click (PPC) model arrived on the scene. Suddenly, a searcher’s everyday quest for knowledge became an advertising channel the likes of which we’d never seen. Almost overnight, “being found” on the internet turned into a commodity — and a highly valuable one at that. 
Engines scrambled to fine-tune the PPC model, developing self-service interfaces so that advertisers could manage their own campaigns, and a new field of marketing was born — search.\nThe 6 eras of search\nThe last 20 years of search can be broken down into six defining eras: Query, Demographic, Mobile, Voice, Personal and ultimately, Intelligent.\nDemographic search quickly grew out of a need to qualify searches, leading to day parting and language targeting to help advertisers zone in on specific markets.\nMobile search started picking up steam in the mid-to-late 2000s, with marketers touting that each subsequent year would be the “year of mobile;” however, the explosion of the mobile era didn’t begin until around 2012, and we’ve now seen mobile outpace desktop search growth. This explosion has led to device bidding and location targeting, with searchers expecting hyper-relevant results based on their location.\nAnd finally, with the advent of natural language searches, the era of voice arrived, and search officially jumped out of the text box and into our lives. Bing currently powers the voices of Cortana (Windows 10), Siri (iOS) and Alexa (Amazon Echo).\nThis, of course, is only the tip of the iceberg. Search is no longer just about finding information. It’s part of our everyday life, an assumed presence, and there is no going back to life without search. We’re entering an era where search is personal, predictive and actionable. It’s not only on our computers, but it’s on our phones. In our homes. In cars. In gaming systems. We can shop, book travel, make reservations — all directly within the results pages. Search engines are getting more and more intelligent, delivering contextual results based on location, trends and historical data.\nSo the next time you call out to your Amazon Echo, “Order me a pizza,” take in a brief moment of awe at how far we’ve come so quickly and what amazing things lie ahead.\nCheck out other great content at Bing."
},
{
"score": 0.1807292103767395,
"title": "Why search sucks",
"id": "https://www.businessinsider.com/search-web-email-google-streaming-online-shopping-broken-2022-4",
"url": "https://www.businessinsider.com/search-web-email-google-streaming-online-shopping-broken-2022-4",
"publishedDate": "2022-04-24",
"author": "Adam Rogers",
"text": "I needed to find an old email the other day. That's it. Simple. But I dreaded the prospect, because I use Gmail. And the search function in Gmail, as millions of users know from bitter personal experience, makes it almost impossible to find what you're looking for. You hunt for a flight confirmation number; you get every newsletter from the frequent-flyer program. Search by sender's name, and you get only the most recent few days of emails from them — if you get anything relevant at all. Search for an attachment, and you can't tell which message actually has the attachment or which ones are just replies. I'd laugh out loud if I didn't have a headache from banging my head against my desk. How is it possible that the company that makes the best, most robust technology for searching the internet also makes an email product in which the search function doesn't actually function? But the truth is, Gmail isn't an outlier when it comes to search. Apple's Spotlight search often coughs up no results for specific documents or files I need; search in the Finder finds too many. The screens of Google Maps and Apple Maps are too cluttered with functions to see, especially on mobile. Amazon literally shows you things you did not ask for, followed by its own knockoffs, before taking a stab at locating what you typed into the search bar. Instagram doesn't have image search. Searching for a specific tweet you remember, even by the handle of the tweeter? Good luck. So my question is this: Why is search so bad? Solving how to search for things was the key to the web's integration into mainstream life, the thing that moved the internet out of university basements and into our pockets. Now, it seems as if our ability to locate and retrieve information is getting worse instead of better, right at the moment when true facts are humanity's most precious commodity. 
When we moved into the digital age, we made a collective decision to store almost everything we know — even our most personal and intimate memories — outside our brains. At this point, search is memory. And when we all use the same slightly broken tools for recall, we're at risk of forgetting ourselves. To understand how we got from there to here — from our neatly organized past to the hopelessly cluttered present — we need to understand that search comes in different flavors. Different kinds of information require different kinds of searches. But every form of search has one thing in common. To put it in technical terms, they all kind of suck.\n\n﹒\n\nBroadly speaking, there are two kinds of search. One is \"known item,\" where you have a specific fact or object or destination in mind, and you just need to know where it is. The other is \"exploratory,\" where you don't know what you don't know. Email is a sort of special case. People often know a specific email exists, or who sent it, or when, but those criteria might also fit a bunch of other emails. People mostly want to find emails that are recent, from within the past month or so — except sometimes they don't. People often remember a few key details, like who sent an email or what some of the words were, but sometimes they misremember. \"The reason why search provided by individual providers, including Gmail, is often quite awful is that the underlying problem is quite hard,\" says Sridhar Ramaswamy, a former Google executive who is now the CEO of Neeva, a search-engine startup. When Google first arrived, it solved the search problem in several ingenious ways. The famous one, the one you've probably heard of, is called PageRank, which counted \"inbound links\" — i.e., the number of times other sites cited the same result as a source. PageRank gave you the answer that the rest of the internet thought was a good answer. 
But Google's deeper power is in identifying what those pages are about and being able to associate the kinds of words you and I might search for with the stuff on those pages. And then there's the index — Google's regularly updated crawl of the entire internet, or a vast majority of it. Today the index surveys 100 million gigabytes of data, hundreds of millions of webpages. That scale gives Google a huge advantage: statistics on all the different things people search for, and the ways they do it. Alas, most of that advantage evaporates in Gmail. Yes, there's a lot of email in the world — according to a book on the problem by a bunch of Gmail engineers, people receive over 300 billion emails every day, if you count machine-generated stuff like receipts or notifications. That sounds like a big enough corpus of data to make Google-scale stats work. But email isn't a collective thing like the web. Your email inbox is yours, and no matter how much email you have in there (read or unread, I don't judge), it's not enough to let a Google-type search engine function properly.\n\n.\n\n\"The algorithms that Google uses to search news are not necessarily going to be effective,\" Ian Ruthven, an information scientist at the University of Strathclyde, told me. \"Even though it's huge to you, it's tiny, and the statistics don't work as well.\" Plus, while Gmail is happy to help advertisers target you based on your email behavior, it doesn't collect or share information about how people search their email. It'd be a privacy violation if it did. That means software engineers trying to build email search capabilities can't easily draw on statistical commonalities. They can't learn from the crowd. They have to rely on survey data, or anonymized usage data, or giant, stored repositories of email from dead companies. One of the biggest research archives turns out to be all the email sent inside Enron, the disgraced energy-arbitrage company. 
\"In web search, you have a collection of documents, the webpages, and that is shared for all users. If you search for something and click on a result, and then if I search for the same thing, Google can use your data, the clicks, to improve my search,\" says Hamed Zamani, the associate director of the Center for Intelligent Information Retrieval at UMass Amherst. \"Email, you have your collection of email, I have my own. The transfer of knowledge between clicks, or any feedback that Google gets from users, cannot be shared.\" Basically, email search is a massive coding problem spread out across millions of users. Trying to locate an email, ironically, may be the most solitary activity in the digital age — the only moment when we're truly alone with our data.\n\n﹒\n\nMost websites, especially startups, don't have the money or know-how to build their own search function. You can click the magnifying glass on a news site's homepage, but it's likely to cough up irrelevant articles, or ones that don't reach back far enough in time. Same thing if you try to search social media: You'll get pointed to lots of specific uses of your query words, but not necessarily from the user you actually want. And if the site has a more calibrated \"advanced search\" option, good luck finding it. Google's ubiquity has led us to assume that every horizontal box with a little magnifying glass on the side will function like a Google search. But they don't. Internet giants like Amazon or Facebook spend lots of time and money on search functions, but smaller organizations can't, or don't. Many use off-the-shelf search software — products like Elastic or Apache Lucene — and customize it a little. They're solid products, but they don't have the advantages of scale that Google does. And since most people will wind up using Google anyway, creating a custom search function just isn't worth it for most companies. 
\"It's not the heart of the business,\" says Doug Cutting, a retired search-engine builder who helped invent Lucene. \"They tend not to invest.\" That also means that what Google has trained us to do — type keywords into the search bar, over and over, until we find what we're looking for — won't necessarily work on other sites. \"When people develop these habits and then go somewhere else expecting the system to be just as effective, they're often supremely disappointed,\" says Chirag Shah, an information scientist at the University of Washington. There's a simple solution to the problem: Companies could give Google's bots access to their websites. The algorithm would help customers find what they were looking for. But that would expose a company's internal data — and the habits and behaviors of its users — to a Silicon Valley giant renowned for its ferocious competitive instincts. Letting Google handle your search means letting Google all up in your business — literally. \"Giving away the front door to your product puts you at incredible risk,\" Ramaswamy says. \"Facebook, Instagram, Twitter, Pinterest are exceptionally careful about what they will and will not let Google do. They've all learned there is zero incentive to just giving all their information to Google.\"\n\n﹒\n\nHere's where the problem is perhaps less technical than venal. Not every site wants to show you what you want to find. Let's say you want to buy something. Say you're searching for something on Amazon, which once prided itself on using \"users like you\" recommendation filters and sophisticated ranking of results to display its wares. Today that website will literally show you other things first, followed by its own knockoff products, and then paid advertisements, before deigning to show you what you asked for. After a couple of decades using Google, we're all trained to assume that search results get ranked by relevance to our query. 
But the fact is, a website trying to sell something will always game its results to its benefit. The catch is, the options for searching a commerce site can't completely suck, because then people won't use it — you'd be forgiven for abandoning Home Depot for McMaster-Carr on this basis alone. A site trying to sell you something has to show just enough of what you want to buy, and just enough of what it wants to sell — that's the green double-zero sweet spot that makes sure the house maintains its win margin. \"I've actually worked at those places, so I know,\" says Shah, the information scientist. \"They have to balance what will increase their profit margin and what will give the user the sense they're getting a good deal and relevant result.\"\n\n﹒\n\nStreamers, run by big content creators like Netflix and Disney, are like most websites — they don't want Google to have access to data that could give away a competitive edge. So a simple Google query doesn't always cough up particularly useful results from them. It's also why the search function on Apple TV yields confusing results at best — because streaming services won't grant Apple access to their data. Why would HBO Max want its customers to get lost in a sea of results using Apple's interface instead of directly searching its own? As for why their own internal search functionality doesn't work well — well, that goes back to gaming the results. Representatives for the streaming services I spoke with emphasized their focus on recommendation algorithms, which show you content based on what they can tell about your preferences from what you've already watched. That's in part because a more straightforward search would show you that their libraries are finite. If you go searching for stuff that isn't there, you might start thinking about subscribing to a different service. So they proactively show you \"The Goonies,\" which they have, before you start looking for \"Gremlins,\" which they don't. 
Recommendation algorithms based on past behavior are really just search engines where the queries are implicit. Show me other movies like this. That's called a zero-query search, and if you've ever fallen down a YouTube or Instagram rabbit hole, you know the vibe. But in an oblique way, every search we do has a secret, implicit zero-query search embedded in it. We're looking for something that scratches an intellectual or emotional itch — something that makes us feel better in some unarticulated way. That's why algorithmic recommendations are so pernicious. They work! They give us what we want, confirm our suspicions, comfort us, and tell us we were right about what we already thought, even when that's not what we need.\n\n﹒\n\nThere's a reason the company's name is a verb. It's not just the over-90% market share, or the unbeatably massive index, or even the speed with which it responds to queries. Maybe the days of the \"10 blue links\" — when the first page of results was reliably filled with the most relevant places to find the information you were looking for — are over. But it's still the case that, for most searches, Google works. Sure, there's some tension between Google's \"editorial\" product — the search results — and its advertisements. Every few months another article or report (here's one) confirms that more and more Google results are pay-for-play, including plenty of straight-up spam, or grifts. After all, the whole point of SEO — search-engine optimization — is for sites to game their way to the top of Google. One way or another, most results you get on Google are the product of a concerted effort to win your attention. Pandu Nayak, vice president of search at Google, says the ad-edit dynamic is a good-faith one: \"If you talk to the ads team, they're very focused on making sure that ads are actually helpful. 
Because they realize if they have unhelpful ads, that's a recipe for people to learn to avoid them altogether.\" Nevertheless, he adds, \"search is by no means a solved problem.\" Which, when you think about it, is quite a thing for a VP of search at Google to say. One way to think of search — not just online or digital — is as an attempt to interact with any system of organized information. Whether that's asking a librarian in Alexandria to fetch down a scroll, sending a clerk to thumb through files in a cabinet in the basement, heading into the stacks with a Dewey Decimal number, or typing keywords in the syntax of formal Boolean logic into a LexisNexis terminal at the reference desk, we're forever trying to stare into an abyss filled with information and cajole it into telling us what it knows in a way that both it and we understand. The people who first started thinking about how computers were going to work strongly implied that these new devices would solve both known-item and exploratory search. In 1945, Vannevar Bush, who headed up scientific research for the US government during World War II, said \"associative indexes\" — links, basically — would be the key to a desktop (well, desk-size) information-processing device he called a memex. The first guides to the web, in fact, were literal lists of websites. That's what Yahoo originally was. \"The idea was, you'd try to create a hierarchy of topics you navigate,\" Nayak says. \"It was a great way to organize the web when it was small. But it quickly became infeasible.\" There was just too much internet. Google figured out how to search through that vast bulk so quickly that users could simply do keyword queries over and over, until the right answer showed up. It didn't matter whether you were looking for a known item (Is Michael Caine still alive?) or just exploring (What are Michael Caine's best movies?). We use the same tool for both. 
But over time, in Google's drive to be big and comprehensive, it got worse and worse at finding stuff that was small or obscure. \"They're trying to serve most of the customers well most of the time,\" says Ruthven, the information scientist. \"If you're part of the fatter part of the tail, you get better results. But if you're doing an unusual search, or you have really unusual taste in music or something, you'll get worse results.\"\n\n.\n\nThe vast majority of the internet is crap, or stuff hardly anyone cares about. Google mostly ignores all that, optimizing to a fraction of its indexed pages. \"Right away, that's a filter,\" says Cutting, the search-engine builder. \"As an optimization, they've just restricted what they're searching.\" And because most people don't want what Cutting calls \"esoteric shit,\" Google winds up favoring the many over the few. \"The feeling that search is dumbed down,\" Cutting says, \"is because search is a mainstream thing now.\" And that's only going to become more of the norm. Google's massive store of data has given it the ability to create software that can actually understand and produce what looks a lot like human speech. This kind of \"large language model\" means Google search interactions may come to look less like an exchange of keywords for links and more like an interaction with a librarian in the old days, a trade of questions and answers. But that will be illusory. Google's algorithms will be able to answer queries in 75 languages, but those answers will still come from parts of the web in Google's index that the company has determined to be \"a high-quality subset.\" The search bar will be easier to use, but the answers won't be more right. It's tough to imagine a technical challenge to Google's hegemony. So many of the brains and so much of the data that might fix search are swiped into the Googleplex. 
\"If you look at who's got the data,\" Cuttings says, \"who's got access to what people are actually searching for — if you're an academic, you want to get an internship or job at a place like Google, and then get permission to publish, because they have all the resources you need. The cutting-edge work is going to be done at a place like Google or Microsoft or Yandex, and that's unfortunate.\" Still, more than a half-dozen startups are hoping to come at the king. Some offer the ability to customize what and how you search, transparency that Google forgoes in favor of more and more direct answers to questions. Neeva, the Google competitor run by Ramaswamy, promises an ad-free experience that will search both the web and information on your own computer while protecting your privacy — for a subscription fee. Or you could enhance your Google search, as some search experts suggest, by using Google to search Reddit to find out what real humans say about your query. But you'd still be using Google.\n\n﹒\n\nA couple of months ago, a comic-book critic and historian I like tweeted a panel from an old Batman comic. It was a Silver Age meta thing showing Batman sitting in what looks like a library, paging through books and complaining that his editors at DC Comics are nagging him to pick his best stories for a compilation. \"Editors are merciless men,\" Batman says. While I was working on this article, I thought it'd be funny to send that Batman panel to my own editor. But of course I couldn't find the historian's tweet. I used Twitter's basic search function to combine the guy's name with some language I remembered from the tweet. But the tweet still didn't show up. I went to Twitter's advanced search, did the same, added \"editors are merciless men\" and \"Batman,\" and still got nothing. Desperate (well, desperately procrastinating, since I was supposed to be writing my story), I opened up the writer's Twitter feed and started scrolling. Zilch. 
When I got too far back in time, I stopped, befuddled. On a hunch, I went to Google and typed in everything I remembered about the tweet except the writer's name. And there it was, in the top 10 blue links. Turns out I'd misremembered who tweeted it. It was a different comic-book critic and historian, whom I also like. The problem was not search. It was me. People expect Google or Bing or the magnifying glass on their computer to answer questions like the librarians of old — even complex, open-ended questions that generate complex, often-contradictory answers. \"In some senses, the big search engines have trained us to behave in certain ways: short queries, mainly look superficially at the first page,\" Ruthven says. \"In an exploratory search scenario where you don't know the vocabulary or domain, it's not a good model for interacting with a search system.\" And when a search doesn't produce the answer you're looking for, what's the most human thing to do? Keep asking endless variations of the same question, over and over, until you want to smash something. \"We've seen in some of our studies that people will keep trying the same kinds of queries again and again, hoping it'll yield the right results,\" says Shah, the information scientist. \"They're not willing to change their behavior much.\" Search will always suffer from what we searchers know, or think we know — and what we don't. Our own errant certainty, our mistaking unknown unknowns for known unknowns, puts limits on what we type into a search box. And because Google is pretty good at finding close to what we want from nothing more than a bag of misspelled keywords, we think we're pretty good at searching. Any failures, we assume, must be on the other side of the screen. But search, by necessity, will always involve an interface between human and machine — a relationship, if you will. So how do we fix our troubled interactions with search? 
Knowing that a healthy relationship is founded on open dialogue, I asked my search partner for suggestions. \"How do I fix our relationship?\" I Googled. \"Face and embrace your differences,\" Google replied. Words to search by. Adam Rogers is a senior correspondent at Insider."
}
],
"origQuery": "Here is an insightful article about the state of search:",
"requestId": "44776fa86010e6bd85b58f439bb3bcfc"
}
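Once parsed, the response is straightforward to work with in code. Here is a minimal Python sketch using the field names shown above (`results`, `score`, `title`, `url`); the sample dict is a truncated stand-in for the full response, which carries ten results plus full page `text`:

```python
import json

# A truncated sample mirroring the response shape shown above.
sample = json.loads("""
{
  "results": [
    {"score": 0.218, "title": "The Open Secret of Google Search",
     "url": "https://www.theatlantic.com/ideas/archive/2022/06/google-search-algorithm-internet/661325/"},
    {"score": 0.181, "title": "Why search sucks",
     "url": "https://www.businessinsider.com/search-web-email-google-streaming-online-shopping-broken-2022-4"}
  ]
}
""")

def top_titles(response: dict) -> list[str]:
    # In the sample above, results arrive ordered by score, highest first.
    return [r["title"] for r in response["results"]]

for rank, title in enumerate(top_titles(sample), start=1):
    print(f"{rank}. {title}")
```

The same field access works on the live response body returned by the curl request above.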
We also provide SDKs in Python and JS/TS to abstract these API calls.