Author Topic: ConFetch Blocking methodology...
Axn...

ConFetch Blocking methodology...
on 12/11/2005 19:03 (UTC)
Hello,
I thought I should mention how I'm determining which
sites are blocked at the second level with ConFetch.
It seems that DebugViewNT has a nifty way of showing
how (the even niftier) TreeWalk loads the filter.conf
in 'descending specificity'...

And it can be captured to file, which looks like this:

<snippet>
zone information.com/IN: loaded serial 0	
zone dp.information.com/IN: loaded serial 0	
zone search.information.com/IN: loaded serial 0	
zone searchportal.information.com/IN: loaded serial 0
</snippet>

...so-o-o, this shows that somewhere in the files we've
downloaded and merged, one of our Master Lists' authors
has blacklisted the second level domain name... By
searching the DebugView log for these types of entries
and determining that the domain name is listed ahead
of (at least) 3 of its subdomains, we can keep just the
"information.com" entry, remove all the other
"*.information.com" entries from our config, and still
block access to the subdomains. Those separate subdomain
listings are required by a hosts file, but not by
TreeWalk. Fairly basic, I think... ;)
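
For anyone who wants to try that check on a captured log, here is a
minimal Python sketch. It is only an illustration, not the actual
ConFetch routine: the function name second_level_candidates, the
regular expression, and the rule "second level = exactly one dot"
are all assumptions (that last one would misfire on names such as
example.co.uk).

```python
import re

# Matches captured lines such as:
#   zone dp.information.com/IN: loaded serial 0
ZONE_LINE = re.compile(r"^zone\s+(\S+?)/IN:\s+loaded serial \d+")


def second_level_candidates(log_lines, min_subdomains=3):
    """Find second-level names listed directly ahead of their subdomains.

    Returns (domain, subdomain_count) pairs for every name with exactly
    one dot that is immediately followed in the log by at least
    `min_subdomains` of its own subdomains.
    """
    zones = []
    for line in log_lines:
        match = ZONE_LINE.match(line.strip())
        if match:
            zones.append(match.group(1).lower())

    found = []
    i = 0
    while i < len(zones):
        parent = zones[i]
        j = i + 1
        # Walk over the run of subdomains that follows the parent entry.
        while j < len(zones) and zones[j].endswith("." + parent):
            j += 1
        count = j - i - 1
        # Assumption: one dot means "second level" (not true for co.uk etc.).
        if parent.count(".") == 1 and count >= min_subdomains:
            found.append((parent, count))
        i = j
    return found


if __name__ == "__main__":
    sample = [
        "zone information.com/IN: loaded serial 0",
        "zone dp.information.com/IN: loaded serial 0",
        "zone search.information.com/IN: loaded serial 0",
        "zone searchportal.information.com/IN: loaded serial 0",
    ]
    print(second_level_candidates(sample))
```

The threshold of 3 is taken from the "(at least) 3" above and is
left as a parameter so it can be tightened or relaxed.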

There's always a possibility that I may have "accidentally"
blocked a second level name that is not listed in the
MLs, so please let me know if you find one and I'll fix
it...

This 'entire process' can be monitored pretty easily for
future "server farm" type activity with a simple script,
and once there are a few "collected", I'll update the
ConFetch routine to further condense our TW list. You'll
still be (additionally) protected (without ads <g>) by
regularly running ConFetch in the meantime[s]...
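
The condensing step itself could look something like the sketch
below. Again this is a guess at the idea rather than the real
ConFetch code: the name condense is hypothetical, and it assumes
(as described above) that a blocked name in the TW list also
covers every subdomain beneath it.

```python
def condense(domains):
    """Drop every name that is a subdomain of another name in the list.

    Assumption: a blocked parent zone already covers its subdomains,
    so the subdomain entries are redundant.
    """
    names = {d.strip().lower().rstrip(".") for d in domains if d.strip()}
    kept = []
    for name in names:
        labels = name.split(".")
        # Every proper parent of the name, e.g. a.b.com -> b.com, com
        parents = (".".join(labels[k:]) for k in range(1, len(labels)))
        if not any(parent in names for parent in parents):
            kept.append(name)
    return sorted(kept)


if __name__ == "__main__":
    merged = [
        "information.com",
        "dp.information.com",
        "search.information.com",
        "ads.example.net",
    ]
    for name in condense(merged):
        print(name)
```

Names whose parent is not in the list (ads.example.net here) are
left alone, so nothing new gets blocked by condensing.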

You never know, but perhaps our "filter.conf" can be
placed online somewhere someday so folks can "fetch" it
regularly. Does anyone have any thoughts on this? ATB. 

Regards, Axn... (Delete caps and dash for reply)
Jim Byrd
Re: the HOSTS file and ConFetch Blocking methodology...
on 13/11/2005 05:01 (UTC)
Hi Axn and all - Was just re-reading this and had a thought that the way I'm
handling things might be of some interest/use to others, so I'll tell you
about it, FWIW.

First, for those of you not so familiar with the subject of HOSTS files, a
small tutorial:

The original purpose of the HOSTS file (BTW, it should always be named this
way - all caps, no extension) was to provide a local (therefore fast)
translation from host names to IP addresses for frequently visited sites
(typically your Favorites).  It can still be used this way (I do so, for
example - there are utilities available such as CIP,
http://dl.winsite.com/bin/downl?500000007704 which will convert your
Favorites to IPs, which you can then save and copy into your HOSTS file.)

It has also come to be used to block ad/malware servers, using this same
mechanism to redirect requests for them to your local machine instead of
their servers.  See here for some good info about this use:
http://www.mvps.org/winhelp2002/hosts.htm  This site also has downloads for
some utility programs which you will find useful if you decide to use a
HOSTS file: RenHosts.bat, http://www.mvps.org/winhelp2002/RenHosts.bat, and
lockhosts.bat and unlockhosts.bat,
http://www.mvps.org/winhelp2002/lockhost.bat and
http://www.mvps.org/winhelp2002/unlockhost.bat.  The lock and unlock files
can be used to protect the HOSTS file in between updates so that it doesn't
get hijacked by malware, while the rename hosts program will allow you to
easily enable or disable the HOSTS file (while keeping the correct naming
convention).  An even more convenient lock/unlock solution with additional
capabilities is HostsMan, here:  http://hostsman.abelhadigital.com/.

As to size/performance - with any relatively modern computer the delay added
by the HOSTS lookup overhead should be negligible for even moderately large
HOSTS files (typically 250KB to 500KB) used for ad/malware blocking.  If you
also use it for name-to-IP caching as I referred to above, the time saved
over going out to the net for DNS lookups will offset this many times.  In
fact you may notice some speedup in "normal" address browsing.



OK, so more than you ever wanted to know, right?  Now for what I started off
promising:

Axn's neat confetch package, in conjunction with ObiWan's neat package (TW,
that is), now allows us to very conveniently block unwanted sites within
TreeWalk, so that secondary function of the HOSTS file is really no longer
necessary (particularly in my case, since Axn's choice of files just happens
to be the ones I was previously using - YMMV, however.)  As I mentioned
above, I also use my HOSTS file for high speed name-to-IP address
resolution, so I've now limited it to just that function, and perform no
blocking there at all.
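
If you want to see how much of an existing HOSTS file is blocking and
how much is plain name-to-IP resolution before trimming it down, a
rough Python sketch follows.  The name split_hosts is hypothetical, and
treating 127.0.0.1 and 0.0.0.0 as the "blocking" addresses is an
assumption about how your particular file was built.

```python
# Assumption: these addresses mark blocking entries in the file at hand.
BLOCK_ADDRESSES = {"127.0.0.1", "0.0.0.0"}
# The localhost line is a normal entry and must never count as a block.
KEEP_NAMES = {"localhost"}


def split_hosts(text):
    """Split HOSTS text into (blocking_names, resolving_entries)."""
    blocking, resolving = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        address, names = fields[0], fields[1:]
        for name in names:
            if address in BLOCK_ADDRESSES and name.lower() not in KEEP_NAMES:
                blocking.append(name)
            else:
                resolving.append((address, name))
    return blocking, resolving


if __name__ == "__main__":
    sample = (
        "127.0.0.1 localhost\n"
        "127.0.0.1 ads.example.com  # blocked\n"
        "192.0.2.10 www.example.org\n"
    )
    blocked, kept = split_hosts(sample)
    print("blocking:", blocked)
    print("resolving:", kept)
```

The addresses and names in the sample are placeholders from the
reserved documentation ranges, not real entries.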

Now something significant to note is that the HOSTS file is consulted BEFORE
TreeWalk's functions are invoked.  This allows me (and you if you use
confetch) to add to my HOSTS file sites that I DON'T want blocked by TW even
though they may be on "the list" (for whatever reason), in addition to its
normal Favorites complement.  I just open a CMD window and do an    nslookup
<site name>      to get the IP and then add it to my HOSTS file in the same
format used for my Favorites, e.g.,   xxx.xxx.xxx.xxx   hostname.  This
allows me to access the site without having to disable TW if that would
otherwise have been required.  This won't solve the
"sub-domains-we'd-like-to-keep" issue, however, since the HOSTS file
resolves or blocks only on the full host name (there are no wildcards).
(BTW, be sure to keep the 127.0.0.1 localhost line and a trailing blank line
in your HOSTS file.)
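
For those who would rather script that step, here is a small Python
sketch of the same idea.  The helper names hosts_entry and add_entry are
made up for illustration, and socket.gethostbyname simply asks the
system resolver, so if the name is currently blocked locally the lookup
may need to go to some other DNS server instead (not handled here).

```python
import socket


def hosts_entry(name, resolve=socket.gethostbyname):
    """Build one HOSTS line in the 'xxx.xxx.xxx.xxx   hostname' format.

    `resolve` can be replaced, e.g. with a function that queries a
    different DNS server; by default the system resolver is used.
    """
    return f"{resolve(name)}\t{name}"


def add_entry(hosts_text, entry):
    """Return HOSTS text with `entry` appended if it is not there yet.

    Existing lines (including the localhost line) are left untouched and
    the result always ends with a newline.
    """
    lines = hosts_text.splitlines()
    while lines and not lines[-1].strip():
        lines.pop()  # drop trailing blank lines before appending
    if entry not in lines:
        lines.append(entry)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A fixed placeholder address stands in for a real lookup here.
    entry = hosts_entry("www.example.org", resolve=lambda n: "192.0.2.10")
    print(add_entry("127.0.0.1 localhost\n", entry), end="")
```

Writing the result back to the real HOSTS file is left out on purpose,
since that depends on where the file lives and whether it is locked.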

Perhaps this will be of use to some of you anyway.

Regards, Jim Byrd, MS-MVP/DTS/AH-VSOP
My Blog, Defending Your Machine, here:
http://DefendingYourMachine.blogspot.com/
Axn...

Re: the HOSTS file and ConFetch Blocking methodology...
on 13/11/2005 06:20 (UTC)

Excellent Jim! Thank you very much! I may ask to "borrow it"
some day!!!  ATB...

Regards, Axn... (Delete caps and dash for reply)
The Engine
Re: the HOSTS file and ConFetch Blocking methodology...
on 13/11/2005 10:41 (UTC)
"Jim Byrd" <jrbyrd@spamlessadelphia.net> wrote in
news:ce334e9$7a3e70fe$7b7@news.ntcanuck.com:

> Now something significant to note is that the HOSTS file is consulted
> BEFORE TreeWalk's functions are invoked.  This allows me (and you if
> you use confetch) to add to my HOSTS file sites that I DON'T want
> blocked by TW even though they may be on "the list" (for whatever
> reason) in addition to its normal Favorites complement.  I just open
> a CMD window and do an    nslookup <site URL>      to get the IP and
> then add it to my HOSTS file in the same format used for my Favorites,
> e.g.,   xxx.xxx.xxx.xxx   URL.  This allows me to access the site
> without having to disable TW if that would otherwise have been
> required.  This won't solve the "sub-domains-we'd-like-to-keep" issue, 
> however, since the HOSTS file resolves or blocks only on the full
> domain name.  (BTW, be sure to keep the 127.0.0.1 localhost line and a
> trailing blank line in your HOSTS file.)
> 
> Perhaps this will be of use to some of you anyway.

Jim, thanks very much for this post.

Previously, I used the MVPS HOSTS file, but after installing TreeWalk and
using confetch, I restored my HOSTS file to the original, with just the
localhost line in it. I didn't do this from any great knowledge base, I
just thought that it would be stupid to have two ad-blocking systems
running at the same time!!!

I'll now look at your links and figure out how to add my favorites to it
too. I'll also bear in mind that I can add any of the confetch blocked
sites to HOSTS to enable access, although at the moment I can't think of
any that I would even *want* to.

Thanks again.
Jim Byrd
Re: the HOSTS file and ConFetch Blocking methodology...
on 13/11/2005 21:48 (UTC)
YW, Engine - Well, there can be legitimate reasons to unblock certain
sites - let me give you one example.  I monitor accesses to my Blog using
StatCounter.  This will normally show up on these lists because it tries to
install a tracking cookie.  Since I control cookies and block tracking
cookies another way (see my Blog!), I can afford to unblock the various
StatCounter URLs using the method I outlined for you, without risk of
exposure in this case.  There are others as well.  For example, I will
sometimes need to go to sites which attempt to install malware in order to
"capture" it for sandboxing and examination, and in this case I will
override momentarily on that machine. (NOT a recommended activity for
others, BTW, if you even suspect 'malware' activity on the site!)
Of course, YMMV.

Regards, Jim Byrd, MS-MVP/DTS/AH-VSOP
My Blog, Defending Your Machine, here:
http://DefendingYourMachine.blogspot.com/