The Mudcat Cafe



Help: search engines

Subject: search engines
From: kendall
Date: 22 May 02 - 07:25 AM

I was checking out my profile in Google, and, found that they list the old e mail address. I e mailed them and got a terse note saying they only provide information. I replied that we expect accurate information. How does this info get submitted, and who do we go to to update old info?



Subject: RE: Help: search engines
From: Noreen
Date: 22 May 02 - 01:38 PM

Kendall, could you explain what you mean by "I was checking out my profile in Google", please? Is it as part of a usenet group, or another site accessed through Google?

(I've only used Google as a search engine, and so don't understand what you mean.)

Would help if I could!

Noreen



Subject: RE: Help: search engines
From: Sorcha
Date: 22 May 02 - 01:48 PM

I don't understand either.......



Subject: RE: Help: search engines
From: MMario
Date: 22 May 02 - 01:56 PM

me neither.



Subject: RE: Help: search engines
From: Bill D
Date: 22 May 02 - 03:17 PM

Google has data based on what it found in some sweep thru the WWW...if it found an old email address somewhere, it will 'eventually' find the new one...but the old one will stay there as long as some page has not been updated...but it does not provide a 'service' of profiles.

(I can still find a few references to the Digitrad being at Xerox in obscure places that have not updated their links)



Subject: RE: Help: search engines
From: Rich_and_Dee
Date: 22 May 02 - 03:49 PM

Hi,

As I understand it, google, like most search engines, runs a piece of software (it used to be called a "spider") that scans as many individual files stored on the web as it can get its hands on, and creates a big index back at the google home site.

When you run an internet search, you're really searching the humungous index. This index is updated when the spider is rerun, which could be daily or once a year.

All google can do is tell you what it found out there. It doesn't make any attempt to verify the data.
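To make the spider-and-index idea concrete, here is a minimal Python sketch. It is only an illustration under assumptions of my own: the page URLs and text are made up, and a real engine's index is far more elaborate. What it shows is that a search consults the stored index, not the live pages, so an edited page keeps matching its old text until the index is rebuilt.

```python
from collections import defaultdict

# Hypothetical pages a spider might have fetched (URL -> page text).
pages = {
    "http://example.com/a": "Kendall sings Maine folk songs",
    "http://example.com/b": "old email address for Kendall",
    "http://example.com/c": "folk music links",
}

def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(pages)
print(sorted(search(index, "kendall")))

# Editing a page does not change search results until the spider reruns.
pages["http://example.com/b"] = "new email address for Kendall"
print(sorted(search(index, "old")))   # still finds page b from the stale index
index = build_index(pages)            # "rerun the spider"
print(sorted(search(index, "old")))   # now finds nothing
```

The stale result in the middle is the situation in the original question: the search engine reports what it saw on its last pass, whether or not the page has changed since.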

Does that kinda answer the question or are we going down the wrong path?

Regards,

Rich



Subject: RE: Help: search engines
From: MMario
Date: 22 May 02 - 04:00 PM

THAT part I understand...but what does he mean by "checking my profile"?



Subject: RE: Help: search engines
From: Jeri
Date: 22 May 02 - 04:45 PM

Sometimes it's hard to figure out what Kendall's talking about. I did manage to find, with a Google search, a link to his profile HERE AT MUDCAT. Kendall, if you send a PM to Pene Azul, he can change it.

There's also an old address at http://www.mainhumor.com/km.htm. The link starts with "Google directory," but Google just links to the above site. There's an e-mail address under Kendall's which is presumably the one to send changes to.



Subject: RE: Help: search engines
From: GUEST,Fran
Date: 22 May 02 - 05:28 PM

"Sometimes it's hard to figure out what Kendall's talking about"

LOL,Jeri

Mind you I reckon Kendall's meds are working pretty well! Look what he looks like now!

Kendall Morse

PS He's the one on the right....

Love you, K

Fran



Subject: RE: Help: search engines
From: Noreen
Date: 23 May 02 - 03:45 AM

*LOL* Fran!



Subject: RE: Help: search engines
From: Geoff the Duck
Date: 23 May 02 - 11:06 AM

GtD Chuckles to himself......



Subject: RE: Help: search engines
From: kendall
Date: 23 May 02 - 11:15 AM

The meds don't work THAT well! Maybe I used the wrong wording; I went to Google, typed in my name, and everything I ever did came up. Most of the sites they posted have my old e mail address (cybertours).



Subject: RE: Help: search engines
From: MMario
Date: 23 May 02 - 11:17 AM

Kendall - you would have to individually contact each SITE manager. Google just searches out and lists the links.



Subject: RE: Help: search engines
From: Stilly River Sage
Date: 31 Jan 04 - 01:27 PM

Here's a hoot: a knockoff of Google called Booble:

This is their disclaimer:

    Booble is intended to be a funny parody of the world's largest and best known search engine. The punch line is that unlike popular search sites, Booble actually works for the adult category.

    Anyone who has tried to search adult content using mainstream search engines knows they only lead to confusing porn sites mined with viruses, pop ups, and credit card scams. This is because online pornographers know how to manipulate the system so that their sites are listed first, regardless of quality or value.

    Booble can't be fooled because each of its 6,000 + listings have been edited and classified by hand, not by the computer algorithms used by the major search engines. In addition, Booble's listings often contain pricing information and, where applicable, Booble directs users to site a product reviews. Best of all, Booble is 100% free to consumers.

    Booble was created by a former senior executive of one of America's leading online services, who now lives in New York City with his wife, a French fashion model scandalously younger than he.


I haven't given it a field test, but I hope someone with a little time on their hands will report back!

SRS



Subject: RE: Help: search engines
From: Bill D
Date: 31 Jan 04 - 01:54 PM

"Here's a hoot:"...more like a "hooter", SRS..


I saw this noted somewhere and tried a couple 'searches' in it...POOH! *grin*...cute, but "not ready for prime time"

If one wants to find 'adult resources', clever use of Google, AlltheWeb, AltaVista, etc., is far & away the easiest way to go...with a little intuition and safety precautions enabled (anti-virus, anti-popup, etc.) there is literally an inexhaustible supply of 'adult' stuff out there. (I have done favors for a couple people to prove this)



Subject: RE: Help: search engines
From: Rapparee
Date: 31 Jan 04 - 02:22 PM

I tried "piety" and got the result that it wasn't to be found....



Subject: RE: Help: search engines
From: Bill D
Date: 31 Jan 04 - 05:00 PM

too many letters, Rapaire! *grin*...'pie' gets a couple.
....(not good ones, though..."choose your perversion" is not my idea of the right attitude!)



Subject: RE: Help: search engines
From: wysiwyg
Date: 31 Jan 04 - 05:05 PM

I'm exhausted just from reading about all the things "I" have apparently done in the world! Good grief!

~S~



Subject: RE: Help: search engines
From: JohnInKansas
Date: 31 Jan 04 - 05:15 PM

So Kendall's "searching my profile" is what some call "the vanity search." I think we had a thread some time back about what people get when they put their own name into Google. Some of our folk were quite amused - some not so much so.

The real clinker in the search engine scam business is that the crawlers that are used to search the web can only read html. Nothing that's out there in database formats will ever be found by most of them, and unfortunately most of the output by "intelligent life forms" (catters excepted?) seems to end up stuffed into some sort of data file. Google, and most others, can only find a link to such stuff if someone talks about it in an html site posting.

Example: DigiTrad is a database. Google can't/won't look inside it. Randomly, someone may post a link to something in the database on an html site, and Google may find that link, but for the most part DigiTrad is "off limits" to most of the popular search engines.

Example: Go to the specialized search engine at ArtCyclopedia and put in the name of your favorite (legitimate) artist. (Try Renoir or Freud if you can't think of one.) You'll get a result showing all the web museums with works by the artist. Pick a work, and do a Google "image" search for the same piece. You will find a link in Google to one of the museums with about the same frequency as you find Google links to DigiTrad. You will find all the poster shops that sell copies of the work, because they post in html, on html sites, but the museums are NOT indexed. Only incidental links to their stuff, when someone comments on an html site, will be found by Google.

On top of the limitation that the crawlers can't read database information, Google appears to stick to their policy of not initiating searches in .org or .edu sites. Thus you get links to stuff on them only when someone talks about someone who talked about something that was talked about … – when the crawler follows a thread of "random postings" that leads it to one of these sites. University library card files are a typical example of a type of resource about as "exempt" from being tapped by Google as the DigiTrad.

The somewhat cynical, but not inaccurate, assessment is that the common search engines don't search the information – they only search the gossip about the "real information" that's out there.

Estimates vary widely as to how much of the web is actually accessible to/through the popular search engines, but I haven't seen a credible estimate that puts it at higher than about 18%. (And I think that's an incredibly optimistic estimate.)

This isn't really a complaint about the search engines. They can be very useful; but their limitations need to be kept in mind when you really need information instead of "gossip."

John



Subject: RE: Help: search engines
From: GUEST
Date: 31 Jan 04 - 05:36 PM

Why don't you go and write a better search engine than Google then, John?

Given your ability to criticise it, I assume that you have a better way?



Subject: RE: Help: search engines
From: JohnInKansas
Date: 31 Jan 04 - 05:52 PM

GUEST -

This isn't really a complaint about the search engines. They can be very useful; but their limitations need to be kept in mind when you really need information instead of "gossip."

I don't view a factual description of what the search engines can, and cannot, do as a criticism. It's merely a description, intended to help people understand what resources they have at their disposal and what they might assume they have - that isn't really there.

"Gratuitous snide remark" is a factual description, but doesn't really say anything about how I feel about what was said.
"Nice input" might be opinion, if I said it.

There are other search tools available. Most of them are somewhat more limited in scope than the common engines. Each of them can be discussed in much the same way. Unless the limitations of each is known and kept in mind, the appropriate use of the "best" tool for a given search is unlikely.

John



Subject: RE: Help: search engines
From: GUEST
Date: 31 Jan 04 - 05:57 PM

their limitations need to be kept in mind when you really need information instead of "gossip."


Hardly a 'factual description'



Subject: RE: Help: search engines
From: Nigel Parsons
Date: 31 Jan 04 - 06:02 PM

Just done a 'Vanity Search' on google for my name.
The very first hit is me.
Searching for images instead (also "Nigel Parsons") the 10th & 11th hits are pictures from Miskin 2003 (from those posted here at Mudcat!)

Nigel



Subject: RE: Help: search engines
From: Bill D
Date: 31 Jan 04 - 06:14 PM

what John is referring to above is what is called "The invisible web" or "The deep web". There are ways to locate stuff that Google and its cousins do not easily find.
look at these sites for more info (or do your own search on "deep web"):

http://library.albany.edu/internet/deepweb.html

http://websearch.about.com/cs/invisibleweb/



Subject: RE: Help: search engines
From: JohnInKansas
Date: 31 Jan 04 - 06:36 PM

Bill D -

Your first link above is an excellent description of the "problem." Recommended reading for anyone really interested in deep stuff.

The second one shows the extent to which the "easy ways" in to the deep stuff often cost money (and are hence inaccessible to misers like us). Quite a few (not all) of the resources listed there are "pay per click," "with paid membership," or "fee based" search resources. That's sort of a separate topic, but this is one area where I'll go with your usual willingness to work a little harder to find the "free way."

John



Subject: RE: Help: search engines
From: Bill D
Date: 31 Jan 04 - 06:54 PM

yup! POOR misers, in some cases. The mathematics of probability is interesting.....given 10s of millions of people putting up information and resources, the odds are high that someone will figure out and/or offer a free way to do 'almost' everything. Takes some diligent searches, though. Or, even better, hanging out in places where the information you want is posted & discussed. There are forums of various sorts for almost any category of knowledge...it is just a matter of locating the 'nexus'...like Mudcat or rec.music.folk for some things.



Subject: RE: Help: search engines
From: JohnInKansas
Date: 31 Jan 04 - 07:05 PM

But Bill, probability also says that 99.9% of the people who come to a forum are looking for an answer. Since not knowing anything doesn't usually slow any of us down on offering comments and opinions, it can be s.o.o.o.o very difficult to figure out which one actually knows anything.

Nothin's free, especially when it's free; but I suppose doing a little work won't really kill us(?). [Opinion reserved]

John



Subject: RE: Help: search engines
From: Stilly River Sage
Date: 31 Jan 04 - 07:26 PM

I'll try this again--first one vanished in cyberspace.

Bill, thanks for posting those links. I regularly post information about databases, search engines, and other stuff in a newsletter for the library where I work. I'll add a bit about this. I am the "how you get there from here" person, pointing to the resources several librarians and bibliographers work hard to set up. But the free stuff is just as interesting, sometimes more so.

SRS



Subject: RE: Help: search engines
From: GUEST
Date: 31 Jan 04 - 07:32 PM

"probability also says that 99.9% of the people who come to a forum are looking for an answer."

I'm sorry but that's unmitigated nonsense that you've made up off the top of your head, John



Subject: RE: Help: search engines
From: Bill D
Date: 31 Jan 04 - 08:01 PM

SRS...you're quite welcome (I ran onto those terms originally in a place that just posts 'interesting' links...The BlackStump) [that's part of what I mean by hanging out in places where info is posted]

John... I know what you mean....but forums come in different types..I try to find the ones where there are real discussions and trading of ideas. (my everyday spot for just keeping up on programs for Windows is alt.comp.freeware....other places & newsgroups discuss spam, programming..etc..) There ARE places where those with new & interesting info just like to show off..*grin*

guest.. you sure have a quaint way of 'discussing'. Have you ever considered a career as a diplomat? Perhaps to Iraq?



Subject: RE: Help: search engines
From: GUEST
Date: 31 Jan 04 - 08:11 PM

Bill,

John said something that was decidedly wrong. I alerted him to the fact. Not too sure how that is quaint.



Subject: RE: Help: search engines
From: Bill D
Date: 31 Jan 04 - 08:22 PM

well, I didn't agree with him totally, either, and I said so ...but I can't see how insulting someone who posts great information 99.9% of the time would help much. (At least I can debate the topic with him without feeling I need to lose my cookie and be obnoxious!)



Subject: RE: Help: search engines
From: GUEST
Date: 31 Jan 04 - 08:29 PM

Telling someone that what they are saying is nonsense isn't obnoxious in my opinion. I've not got a cookie.

John posts a lot of great stuff. No arguement there



Subject: RE: Help: search engines
From: Stilly River Sage
Date: 31 Jan 04 - 08:53 PM

argument. No extra e. But it hardly seems worth arguing about, does it, guest?

Get the point?



Subject: RE: Help: search engines
From: Joe Offer
Date: 01 Feb 04 - 04:55 PM

John, you answered a question I've had for a long time - why don't Digital Tradition songs turn up in Google searches? Quite often, I'll do a Google search and a song will come up in Yet Another Digital Tradition which is a mirror of a (usually) older version of our database (I think it's up-to-date now). I'm experienced enough that I know to search our own database if I get results in "Yet Another DT," but I see that a number of people link to that DT mirror or even copy-paste lyrics from it and post them here.
Now I know why that happens - at least I think I do. I wonder, though - I thought our Forum threads were stored in a database that has a structure identical to that of our online Digital Tradition database - but our threads often turn up on Google. How come?
-Joe Offer-



Subject: RE: Help: search engines
From: Bill D
Date: 01 Feb 04 - 11:17 PM

Joe...Google follows hyperlinks...if it finds a page somewhere that refers to a Mudcat thread, it will follow that far. I have done a search on my name and found Mudcat threads...but only one or two. After 7 years, I guess Mudcat gets mentioned in all sorts of places, and Google is bound to stumble over some of them, but it doesn't go on unless there are hyperlinks to OTHER threads in the one it finds.



Subject: RE: Help: search engines
From: JohnInKansas
Date: 02 Feb 04 - 12:33 AM

Joe -

The first of the two links by Bill D at 31 Jan 04 - 06:14 PM has a pretty good description of the situation.

As I understand it, the crawlers that the search engines use to gather links can only read html. (It's not clear whether they bother with plain ASCII text.)

To get anything out of a database, you have to formulate a "query" to ask for what you want, and the crawlers don't know what to ask for - they can only read what they find, and they don't do DB.

When the crawler follows a link, it may attempt to "index" anything on the site that it can read, although usually it relies pretty much on the page header for the description of contents – which it may index. The page header does probably determine how it indexes the link that brought it to the page.

It will attempt to follow any links it finds on the page (or sometimes on the whole site). So far as I know, there's no clear and consistent "standard practice" on whether a given engine tries to map and index whole sites, or only individual pages, but both are done to some extent.

If the crawler finds a link to something, and can associate it with a "subject," it may add the link to the search data file, but it can't actually go into a database to see if there's anything else there to index. If you click on the link, from the search page, you can see the individual record that was linked, but the site hasn't really been "searched out," it's just there because the link to that specific database record was found elsewhere. As Bill D notes above, if there are links in the record that gets opened it may also follow them, so there is some "penetration," but the crawlers can't scan the whole database to see what else is there.
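A rough sketch of the crawling behavior described above, in Python. Everything in it is hypothetical: the toy WEB dictionary stands in for real sites, the db/song?id=N addresses stand in for database records, and real crawlers are much more sophisticated than this. The point it illustrates is that a crawler that only follows links reaches the one record somebody linked to and never learns that the other records exist.

```python
from collections import deque
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Toy "web" (address -> HTML). Made up for illustration; no network access.
WEB = {
    "forum/index": '<a href="forum/thread1">a thread</a>',
    "forum/thread1": '<a href="db/song?id=2">a song someone linked</a>',
    "db/song?id=1": "<p>Song one</p>",
    "db/song?id=2": "<p>Song two</p>",
    "db/song?id=3": "<p>Song three</p>",
}

def crawl(start):
    """Breadth-first crawl: visit every page reachable from start by links."""
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        parser = LinkExtractor()
        parser.feed(WEB.get(url, ""))
        for link in parser.links:
            if link in WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen

found = crawl("forum/index")
print(sorted(found))
```

Songs 1 and 3 are in the "database" the whole time, but since no page links to them and the crawler cannot issue a query, they never show up in what it found.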

If you're seeing more links to the mirror than to the DT, the most likely reason would be that more people have posted links to individual items at the mirror in some html forum. ONE person who put an html INDEX, with links to individual records, somewhere could account for entries in the mirror database being found by the search machines, even if the mirror database is not directly accessed by any search engine. It's still a crapshoot though, because the page where the index appeared has to have been searched, and there's no way to "force" that to happen.

It does raise the possibility that mudcat and DT presence on search result pages could be raised by a specific effort to post links to individual items in their databases on "searchable" (html site) pages. I'm not sure whether such a "shotgun" approach to advertising would be consistent with "mudcat policy" (or if there is a coherent policy from which to decide whether it should be done). Most likely, as with many such things, it would make some people happy and totally p.o. some others.

A link, strategically placed, to a posting at mudcat, or to a "home page" at DT, with "links to everything" might also work, but that may be stretching it a bit.

I've never been reluctant to link specific individual items, but posting a full index is something I'm willing to leave to the management, and I don't know of a "willing" site where such posts could be done.

People who really want their site to be found often "trade" links. The offer to let people link to another site is only partly to provide more helpful information to users. When you allow them to put a link on your site, you almost always get a link to yours on their site. This sort of doubles the odds that a search engine will find you, since a search of either site will put your "name in the book."

There are numerous "tricks" that people use to try to increase "hits" in the search engines, and some of them work in principle; but the engine managers are really pretty clever about eliminating "self serving trickery." A link to "DT" or to "Mudcat" isn't likely to do much. The engine will follow the link, and might even add that link to its result list; but it can't "get in" to see what's there (so it can't generate a list of songs in the DT by searching the DT).

To "increase presence" in search engine results, each link that the index finds must have an indexable "subject," or the site that it leads to must be "indexable" (html) on its own. Otherwise the link will likely be just discarded, or relegated to an obscure category that few will look for.

Side Note:

One of the "richest" sources for "unsearchables" is the "personal" and/or "subject" web pages of university staff and faculty. These pages usually are clear html, often with lots of links to database information, and frequently are quite scholarly. Unfortunately, even though these pages are themselves "searchable," they're mostly on .edu or .org sites, and don't get searched directly because of "search engine policies." (An example is Chris Witcombe. Don't know who he is, but it's a GREAT site – never hit by a search page so far as I know, although I haven't tried recently – I've got a bookmark. I lost the link a year ago, and it took a very long time to dig it out again, because I got no usable links from the search engines.)

John



Subject: RE: Help: search engines
From: JohnInKansas
Date: 02 Feb 04 - 04:01 AM

Joe -

Appropriate, in a backhanded sort of way, to the question of how searches work, is the link to Google Bombings posted by CarolC in the BS: Miserable Failure thread.

Apparently they can be spoofed.

John



Subject: RE: Help: search engines
From: Jim Dixon
Date: 02 Feb 04 - 01:39 PM

JohnInKansas: If you search for "These pages are maintained by Chris Witcombe, Professor of Art" (in quotes) you find his site, although, interestingly, it's not the first site listed. There are a lot of sites that link to his site and include that quote in their own sites.

I doubt that there is any bias against web sites ending in .edu or .org—at least not in Google. I have found lots of academic web sites using Google. In fact, it is often useful to limit your search that way if you want to eliminate the chaff. For example, try searching for

"folk music" site:.edu

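If you want to build that kind of restricted search into a link, a small Python sketch follows. It assumes the commonly seen https://www.google.com/search?q=... address form, and the function name google_search_url is my own invention for illustration, not anything Google provides.

```python
from urllib.parse import urlencode

def google_search_url(phrase, site=None):
    """Build a search link for an exact phrase, optionally limited by site:."""
    query = f'"{phrase}"'            # quotes ask for the exact phrase
    if site:
        query += f" site:{site}"     # e.g. site:.edu limits results to .edu
    # urlencode escapes the quotes, spaces, and colon so the link is valid.
    return "https://www.google.com/search?" + urlencode({"q": query})

url = google_search_url("folk music", site=".edu")
print(url)
```

Pasting the printed address into a browser should run the same search as typing the phrase and the site: restriction into the search box.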


Subject: RE: Help: search engines
From: Stilly River Sage
Date: 02 Feb 04 - 02:41 PM

I concur with Jim. I end up with edu sites all of the time when I use Google--perhaps it is the nature of the search. I also use Google Advanced most of the time, and the boolean search may have something to do with my results. You can also specify the extension to look for. I just entered [edu] and [Mudcat] as the search term. These are in the first few hits:

http://www2.truman.edu/~adavis/folklinks.html A link to the site and a description.

http://www.lib.washington.edu/resource/search/ResFull.asp?Field=record&ID=19260. A librarian has set up a bibliographic record for the website. Students at the UW will find this reference along with books, journals, etc. in their library catalog. There are many more instances of discussion lists and links posted to Mudcat, etc.

SRS



Subject: RE: Help: search engines
From: JohnInKansas
Date: 02 Feb 04 - 07:12 PM

Jim Dixon -

If you read the entire "privacy statement" for the Google toolbar, there is, or was when I read it a few months ago, a statement that Google does not directly index .org and .edu sites.

What this really means, probably, is that they don't initiate crawlers in these sites. The crawlers will find them if a site that they hit has links to lead them there, so you do get results that will take you there. Most of the Google results that point to these sites do seem to come from links posted on the "commercial" web rather than from searches of and in the .edu and .org sites themselves.

Jim Dixon -

Unfortunately, the search phrase you used (there's always one that will work, if you can find it) wasn't one that I tried. I had copied some rather large text selections from the site, and searched for quite a few keyword and key phrase selections from the page, with no success. I did find the site when one of the "hits" led me back to the place that had originally led me to Witcombe.

(Since then, I've learned to always paste the site addy (from the Address Bar) in any text I think is worth saving. Sometimes the URL printed in footer gets truncated, so that's not completely reliable.)

As you say, the phrase you searched for appears on several other sites, some of which may have been "mined;" but this is probably another case of a link to the site being noted by Google only because the link appeared on a site that was indexed. Since the content of the Witcombe site didn't produce a result (when I searched for it about a year ago) the site itself had apparently NOT been searched and had not had its content indexed at that time.

Google will report many .edu and .org sites, if you do searches for something "in the header," simply because the sites are referenced elsewhere. You will get some results searching for site content, but only if the specific item of content has been cited and linked elsewhere.

So far as I have been able to determine, Google and other common search engines do not "read the contents" of .org and .edu sites to index their content in the same way that they do the website of your local newspaper. This was (but might change) a stated policy in one Google document that I found. Policies and practices do change, and the index results are a very fluid thing. Different days give different results.

John



Subject: RE: Help: search engines
From: The Fooles Troupe
Date: 02 Feb 04 - 10:26 PM

Before you look at this site after having read this far down this thread, place your coffee cup down.... (assuming it is still on line...)

The Fooles Troupe Unofficial Zangelding Home Page
http://homepage.powerup.com.au/~rhayes/zangeld/zangldhm.htm

Join the Zangelding Webring
http://edit.webring.org/cgi-bin/membercgi?ring=zangelding;addform

Robin



Subject: RE: Help: search engines
From: Stilly River Sage
Date: 02 Feb 04 - 10:47 PM

Robin,

What is the point of those pages except to torque up my dyslexia?



Subject: RE: Help: search engines
From: The Fooles Troupe
Date: 02 Feb 04 - 10:52 PM

Zangelding....



Subject: RE: Help: search engines
From: The Fooles Troupe
Date: 02 Feb 04 - 10:59 PM

Have you ever really thought, just how many link lists exist in the World Wide Web? Many of those link lists about certain topics are just link lists to other pages, that should give some information about a certain topic, but are actually just yet other link lists. Sometimes it is so hard to find any real information about a certain topic - there exists only links.

What if one day it should happen, that the WWW has only links to other pages about certain topics? And those other pages have only links to yet other pages about that topic, and those pages have only links to new pages about that topic and so on?



Subject: RE: Help: search engines
From: Bill D
Date: 02 Feb 04 - 11:29 PM

no, I hadn't thought much about that



Subject: RE: Help: search engines
From: JohnInKansas
Date: 03 Feb 04 - 05:30 AM

There is a theory(?) that "All the movie actors and actresses in the US constitute a 7-web."

If you start with actor A, who was in a movie with actress B, who was in a movie with actor C, etc, you can "link" any actor/actress who has ever been credited in a movie to any other actor/actress who has ever been credited in a movie - in no more than 7 links.

Some fanatics reportedly spend vast amounts of energy and effort trying to find "pairs" that require at least 8 links. Of course, when one of them reports an 8-link association, other fanatics must spend endless hours and energy trying to find a "shorter path" to connect that pair in no more than 7-links.

While the "shortest path" problem is usually considered as having no general solution other than by trial and error, statistical analyses have estimated a "web size" for the linked parts of the internet at something between a 17-web up to possibly a 23-web. (These numbers change frequently, depending on who you listen to.)

In principle, from any site (that has links to any other site(s)), you should be able to get to any destination site (that has links to other sites) in no more than 17 (or maybe it's 23) jumps.

This is, of course, only an "odds are" kind of estimate, since it's easy to build in "special sites" (with few links, usually) that are more remote.

A principal difficulty in making more specific estimates of the web-dimension comes from clusters of sites that have lots of internal links, but have few links to other clusters. Even a few "sparsely connected" clusters can greatly inflate the "shortest path" size of the web.
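To illustrate the "fewest links" idea and the cluster effect described above, here is a minimal sketch in Python. It assumes a tiny, made-up link map (the page names `a1`, `b1`, and so on are hypothetical) that fits in memory, and the helper names are illustrative only. On a map this small a breadth-first search gives exact jump counts; the statistical estimates are needed for the real web because nobody holds the whole map.

```python
from collections import deque

def shortest_link_counts(links, start):
    """Breadth-first search: fewest jumps from start to every reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist

def web_size(links):
    """Largest shortest-path length over all reachable pairs (the N in "N-web")."""
    return max(max(shortest_link_counts(links, p).values()) for p in links)

def two_way(pairs):
    """Build a link map in which every link can be followed in both directions."""
    links = {}
    for a, b in pairs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links

# Two hypothetical clusters, each fully cross-linked internally.
cluster_a = [("a1", "a2"), ("a1", "a3"), ("a2", "a3")]
cluster_b = [("b1", "b2"), ("b1", "b3"), ("b2", "b3")]

one = two_way(cluster_a)
# The same clusters joined by a single bridge link between a3 and b1.
bridged = two_way(cluster_a + cluster_b + [("a3", "b1")])

print(web_size(one))      # 1
print(web_size(bridged))  # 3
```

Within one tightly linked cluster every page is one jump from every other, but a single sparse bridge between two such clusters triples the worst-case count, which is the inflation effect in miniature.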

In addition, some estimates put the "webs" that have NO links to or from this "public internet" we live in at 500 to 800 times the size of "our internet." In many cases you can access this "other internet" if you know the address of a site there, but it's not generally accessible by "blue clicky."

John



Subject: RE: Help: search engines
From: Stilly River Sage
Date: 03 Feb 04 - 10:43 AM

Six degrees of separation, à la the web.

The term "Zangelding" means nothing--I find nothing in my dictionary, and the portion "zan" isn't a root that means anything. "Gelder" I could play with, but still, it's nonsense. Is that what it is supposed to be?

SRS



Subject: RE: Help: search engines
From: Bill D
Date: 03 Feb 04 - 11:24 AM

" Is that what it is supposed to be?"

It's best to have questions like that answered by folks who have studied obtuse aspects of phenomenological physics! I'm sure they can help you.



Subject: RE: Help: search engines
From: The Fooles Troupe
Date: 03 Feb 04 - 06:19 PM

SRS

"The term "Zangelding" means nothing--I find nothing in my dictionary, and the portion "zan" isn't a root that means anything. "Gelder" I could play with, but still, it's nonsense. Is that what it is supposed to be?"

Not meaning to be offensive, but you have answered your own question - and you obviously won't need to put your coffee down before reading the site...

Some people just don't get "Theatre of the Absurd" immediately (The Goons & Monty Python, et al) - and it really can't be explained easily...

Robin



Subject: RE: Help: search engines
From: The Fooles Troupe
Date: 03 Feb 04 - 06:37 PM

Interesting Bill D (02 Feb 04 - 11:29 PM)

There is no link to the Fooles Zangelding Page, just to a served assembled HTML page that has a link to the page.... I wonder if in the future, this current new mention of Zangelding will show up too? :-) An interesting project to see just how long it takes too... :-)

And why does AlltheWeb keep offering to search for the Fooles Zangelding page at a wrong address?

It's all too much for me... I'm not really a mathematician but I think I prefer to work it out with pencil and paper... :-)

Robin



Subject: RE: Help: search engines
From: Burke
Date: 03 Feb 04 - 06:45 PM

John,
I looked at several pages at Google. Not only did I not find anything about omitting .edu and .org pages, but there is some specific ability to search .edu locations. Put Mudcat in the search engine & Mudcat.org is at the top of the list.
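For anyone who wants to try the domain-limited search, here is a small sketch in Python that builds such a query using Google's `site:` operator. The function name `google_query_url` and the search terms are made up for illustration; only the `site:` operator and the `q` parameter come from Google itself.

```python
from urllib.parse import urlencode

def google_query_url(terms, site=None):
    """Build a Google search URL, optionally limited to one domain via site:."""
    query = terms if site is None else f"{terms} site:{site}"
    return "https://www.google.com/search?" + urlencode({"q": query})

# Hypothetical search, limited to pages in the .edu domain.
print(google_query_url("folk song lyrics", site="edu"))
```

Typing `folk song lyrics site:edu` straight into the search box should do the same thing.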

You are right about not searching library catalogs (the card files are gone), but that's because they are very large databases, which are not searched. I must admit, though, that if I searched a book title & got a separate hit for every library that owned it, I'd not be happy. I find hits in online bibliographies frustrating enough.



Subject: RE: Help: search engines
From: JohnInKansas
Date: 04 Feb 04 - 08:12 AM

Burke

Google searches mainly for links. The reference to not searching .edu and .org sites that I found was in a "privacy policy" statement that I read very thoroughly when some people were complaining about the "hit count" tracking that Google does if you install the toolbar. I didn't save the policy, but my recollection is that you had to follow about "three links deep" in "associated policies" and the whole "policy documentation" was something like 75 pages of fine print. I printed, but have discarded, about a third of it.

Saying that they don't search these sites does not mean that they don't index links to them that are found anywhere else.

Size of database has no effect. None of the common search engines read database information. [A side-benefit, they don't (can't) read anything in a "text box" or "frame" on a page that they search, so they don't (usually) index pop-ups.]

There is no specific ability to search .edu locations. There is the ability to limit the search to "links to .edu sites."

If the links were found on any site that has been searched, they will be indexed. The site that has been searched will not normally have a .edu or .org name, although many links to .edu and .org sites may appear on it, and will (maybe) be indexed.

You can search with the limit of finding only "links to .edu sites listed in the Google database."

The "Witcombe" site that I cited previously is fairly easy to find now, because it has been linked many times in the "open" web. The site itself (.edu) has apparently NOT been searched, since almost none of the 7,000(?) or so links there are indexed - or at least don't appear in the results Google will give you. I had trouble re-finding the page a year ago, because all I had at hand to look for was the content "inside" the page, which had apparently not been searched by Google.

Lots of .edu links appear on .com pages. They will be indexed when the .com page is searched. You can look only for .edu links, and there are quite a lot of them to look at. Links that appear only on .org or .edu pages will not generally be found by Google, or by any of the other common engines.

Many database sites have their own "local" search engines, so once you get into the site, you can sometimes extend the search on the local system. This is how you can search a database, using the database itself - if you can find the front door. (You do run into quite a few sites where you have to be a "member," or otherwise qualified, to use the search.) There are also some "special interest" search engines (like ArtCyclopedia, cited above) that find stuff that Google misses.

An interesting question is how the Google image search works. My own "working thesis" is that Google has no way of finding an "image." (They're often accessible only through a database, and/or are framed on any page where they're displayed.) It looks like (working hypothesis only) Google can only find links to images so it can only show you an image if someone has posted a link to the image. (The link may be on the same page, or at least the same site where the image appears, often a "click to view;" but "no link = no Google.") If you have to click in a thumbnail, to view the "full scale" picture, the "link" is hidden from Google because it can't (or just doesn't) "select" objects on the page to look "inside" them. Do your own tests on this one. I'm still working on it.

John



Subject: RE: Help: search engines
From: Stilly River Sage
Date: 04 Feb 04 - 09:21 PM

Robin,

It took a trip to the dictionary to find nothing that tied in with the term, but what makes nonsense nonsense is its referent to the sensical. Nonsense itself still manages to convey clues that it is indeed nonsense. That's the part that wasn't registering, and there's no loop.

Bill, that's some site you linked to. I'll send that to the friend who I took my first Grad-level theory class from. :)

SRS



Subject: RE: Help: search engines
From: Bill D
Date: 04 Feb 04 - 11:10 PM

it seems to me that Google looks for images on pages with the term you ask for in text or HTML, then shows you all the images on that page, whether or not they are relevant.

The only search engine I am aware of that looks for named images is AltaVista (which is, unfortunately, also the only search engine that will allow you to specify UPPERCASE). Lordy, I have SO often wanted to limit searches to Capitalized words!



Subject: RE: Help: search engines
From: JohnInKansas
Date: 04 Feb 04 - 11:23 PM

SRS -

I guess I'm not inquisitive enough to get what the discussion's about here. I went to the Zangelding page and clicked on "What is Zangelding" and got:

"Zangelding is nothing. Zangelding means nothing. Zangelding is a non-existent abstract no-thing and there is no information about it, only links to other pages with link lists of links to pages about Zangelding. And those other pages do not have any real information, either. They just have links to other pages of link lists about Zangelding. And so on."

It seemed so perfectly logical to me, that I didn't feel the need to analyze.

The only question is, do you pronounce it "Zan – gelding," which sounds like something sort of sadistic (especially if one were a Zan), or is it "Zangel – ding." If it's the latter, I suppose having one's Zangel dinged could be painful, but if you apply a little upward inflection on the last syllable, … - … - … it's sort of musical.

I think, pending further input, I'll use the nice version of the second pronunciation. Could we make a song?

John



Subject: RE: Help: search engines
From: Amos
Date: 05 Feb 04 - 12:20 AM

Google's image searching is based on text about the image, not the image per se. However, database search technology for actually finding images similar to some selected color, shape, or combination of forms in the abstract has been around since the 90's.

A



Subject: RE: Help: search engines
From: Stilly River Sage
Date: 05 Feb 04 - 12:51 AM

John, perhaps it is the inquiring mind that is looking for the word root, the possible access to a pun, whatever. This one is a clunker, is all. As I mentioned above, when I scanned a page, all it seemed to do was trigger a dyslexic reaction, I simply couldn't read the page.

I use google to search images frequently. My results usually include a mix of image links with the name I'm looking for, i.e. "Rudbeckia" will bring me links to the flower image that has the name "rudbeckia.jpg" and there will be pages in which "rudbeckia" is part of the text. Either/and results. I think it must be doing some true image searching, Amos, because on a page with the word "rudbeckia" in the text, the one image it is liable to land on in the thumbnail linking to the page is the image I was looking for, as distinct from all other flower images also on that given page.
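One way to test the "text about the image" idea against the Rudbeckia example: the sketch below, in Python, picks images off a page using nothing but the file name and the alt text. It is only an illustration under that assumption, not a description of what Google actually does, and the page, file names, and helper names are all made up.

```python
from html.parser import HTMLParser
import os
import re

class ImageTextCollector(HTMLParser):
    """Collect, for each <img>, the words in its file name and alt text."""

    def __init__(self):
        super().__init__()
        self.images = {}  # image src -> set of lowercase words

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        src = attrs.get("src")
        if not src:
            return
        # File name without directory or extension, e.g. "rudbeckia".
        stem = os.path.splitext(os.path.basename(src))[0]
        text = f"{stem} {attrs.get('alt') or ''}"
        self.images[src] = set(re.findall(r"[a-z0-9]+", text.lower()))

def find_images(html, term):
    """Return the images whose file name or alt text contains the search term."""
    collector = ImageTextCollector()
    collector.feed(html)
    return sorted(src for src, words in collector.images.items()
                  if term.lower() in words)

# Hypothetical page with several flower pictures on it.
page = """
<p>Rudbeckia and other flowers from the garden.</p>
<img src="pics/rudbeckia.jpg" alt="yellow coneflower">
<img src="pics/img0042.jpg" alt="Rudbeckia hirta in bloom">
<img src="pics/tulip.jpg" alt="red tulip">
"""

print(find_images(page, "rudbeckia"))
# ['pics/img0042.jpg', 'pics/rudbeckia.jpg']
```

The tulip picture is skipped even though it sits on a page that mentions Rudbeckia, so text matching alone could account for the right flower turning up without any true image recognition.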

SRS



Subject: RE: Help: search engines
From: JohnInKansas
Date: 05 Feb 04 - 01:31 AM

Amos

Actually some "publicly available" image matching software has been around since more like the 60s, although there have been some discrete incremental improvements that brought it bursts of publicity sporadically. Performance was never robust enough to keep it up where people noticed and tried to make much use of it. The FBI has claimed to use "pattern matching" programs for searching fingerprint data, a somewhat simplistic form of image matching, since at least the mid 50s; but published results indicate rather meager performance until much later. It was possibly about the start of the 90s when some "name" databases put systems more or less "up front" where people got some use out of them.

My recollection is that Corel Draw packaged a crude version for consumer use briefly in the early 90s. It was supposed to make it easier for you to search your thumbnails to find the one you wanted. One difficulty is that Corel never gave users any clues about what the "match" criteria were, so when you clicked an image and asked it to find the ones like it, you never knew if you'd get all the yellow ones, all the ones with two round things, or the ones that showed exactly 3 teeth. The retrievals often looked more like random picks than anything.

Based on my observation of results from Google image search, I have to continue with my (tentative) notion that they search first for links, sometimes for text, and the image search just returns anything turned up by those searches that has an image file type. Results can be quite "rich," but the sources all seem more "news, gossip, and business" site oriented than academic or art world.

One recent search for a classic painting turned up 11 hits on the same painting by Munch (a rather nasty looking thing, and all the same one of at least 6 or 7 that he did with the same title) and more than half of them were on blog sites. A few statues with similar titles but with bad personal photos, probably from someone's vacation in Italy and/or France; but no hits for any of the two dozen or so "known" (even to me) artists who did well known works with that title, and none of the dozen or so web museum sites where I've seen postings of paintings with the exact title searched - in 45 pages of results.

John



Subject: RE: Help: search engines
From: The Fooles Troupe
Date: 05 Feb 04 - 08:14 PM

The Zangelding thing was started by someone who is in one of the top links - a site in Sweden I think - they were tickled when I emailed them - they may have lost their site by now - I think it was at a .edu site -

Robin



Subject: RE: Help: search engines
From: Stilly River Sage
Date: 05 Feb 04 - 08:19 PM

This possibly explains why it doesn't work with an English dictionary!



All original material is copyright © 1998 by the Mudcat Café Music Foundation, Inc. All photos, music, images, etc. are copyright © by their rightful owners. Every effort is taken to attribute appropriate copyright to images, content, music, etc. We are not a copyright resource.