Comparison of free traffic accounting programs for SQUID. Configuration and basic parameters of SQUID

    Recently, our company needed to transfer a proxy server from MS ISA Server to free software. It didn’t take long to choose a proxy server (squid). Using several practical recommendations, I configured the proxy to suit our needs. Some difficulties arose when choosing a program for traffic accounting.

    The requirements were:

    1) free software
    2) the ability to process logs from different proxies on one server
    3) the ability to create standard reports with sending by mail, or a link on a web server
    4) building reports for individual departments and distributing such reports to department heads, or providing access via a link on a web server

    The developers provide very little information on traffic accounting programs: a laconic description of the purpose of the program plus an optional bonus of a couple of screenshots. Yes, it is clear that any program will calculate the amount of traffic per day/week/month, but additional interesting features that distinguish one program from others are not described.

    I decided to write this post in which I will try to describe the capabilities and disadvantages of such programs, as well as some of their key features, in order to help those who have to make a choice a little.

    Our candidates:

    SARG
    free-sa
    lightsquid
    squidanalyzer
    ScreenSquid

    A digression

    Information about the “age” of the program and the latest release is not a comparison parameter and is provided for information only. I will try to compare exclusively the functionality of the program. I also deliberately did not consider too old programs that have not been updated for many years.

    Logs are sent to the analyzer for processing in the form in which squid created them and will not undergo any pre-processing to make changes to them. Processing of incorrect records and all possible transformations of log fields must be done by the analyzer itself and be present only in the report. This article is not a setup guide. Configuration and usage issues can be covered in separate articles.

    So let's get started.

    SARG - Squid Analysis Report Generator

    The oldest among supported programs of this class (development started in 1998, former name - sqmgrlog). Latest release (version 2.3.10) - April 2015. After that there were several improvements and fixes that are available in the master version (can be downloaded using git from sourceforge).

    The program is launched manually or via cron. You can run it without parameters (then all parameters will be taken from the sarg.conf configuration file), or you can specify parameters on the command line or script, for example, the dates for which the report is generated.
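    A cron-driven setup might look like the following crontab fragment (a sketch: the paths, config file names and the day-1/week-1 date shortcuts are assumptions to be checked against your sarg version):

```
# /etc/cron.d/sarg (hypothetical)
# daily report shortly after midnight, covering yesterday
10 0 * * * root sarg -f /etc/sarg/sarg-daily.conf -d day-1
# weekly report early on Mondays, covering the previous week
30 0 * * 1 root sarg -f /etc/sarg/sarg-weekly.conf -d week-1
```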

    Reports are created as html pages and stored in the /var/www/html/squid-reports directory (by default). You can set a parameter that specifies how many reports are kept in the directory, for example 10 daily and 20 weekly; older ones are deleted automatically.

    It is possible to use several config files with different parameters for different report options (for example, for daily reports, you can create your own config, in which the option for creating graphs will be disabled and a different directory for outputting the report will be specified).
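    For example, a stripped-down daily config might differ from the main one in only a few options (a sketch using sarg.conf option names; treat the exact values as assumptions):

```
# sarg-daily.conf (hypothetical)
access_log /var/log/squid/access.log
output_dir /var/www/html/squid-reports/daily
graphs no        # skip graph generation for the quick daily run
lastlog 10       # keep only the 10 most recent reports in this directory
```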

    Details

    On the main reports page we see a list of generated reports: the period each one covers (defined by the report creation parameters), its creation date, the number of unique users, the total traffic for the period, and the average traffic per user.

    When you select one of the periods, we will be able to get a topusers report for this period. Below I will give descriptions and examples of all types of reports that SARG can make.

    1) topusers - total traffic by users. A user is either the name of the host to which Internet access is granted, or the user's login. Sample report:


    IP addresses are displayed here. If the corresponding option is enabled, IP addresses are resolved to domain names.

    Are you using authentication? Accounts are converted to real names:

    The appearance can be customized in a css file. The displayed columns are also customizable, and unnecessary ones can be removed. Column sorting is supported (sorttable.js).

    When you click on the graph icon on the left, you will see a graph like this:


    When you click on the icon on the right, we get report 5.

    2) topsites - report on the most popular sites. By default, a list of the 100 most popular sites is displayed (the value can be adjusted). Using regular expressions or setting aliases, you can combine traffic from domains of the 3rd and higher levels to a 2nd level domain (as in the screenshot) or set any other rule. For each domain, you can set a rule separately, for example, for yandex.ru and mail.ru, combine up to the 3rd level. The meaning of the fields is quite obvious.


    3) sites_users - a report on who visited a specific site. Everything is simple here: the domain name and who accessed it. Traffic is not shown here.


    4) users_sites - a report on sites visited by each user.


    Everything is clear here too. If you click on the icon in the first column, we get report 8.

    5) date_time - distribution of user traffic by day and hour.


    6) denied - requests blocked by squid. This displays who, when and where access was denied. The number of entries is configurable (default is 10).


    7) auth_failures - authentication failures. HTTP/407.
    The number of entries is configurable (default is 10).


    8) site_user_time_date - shows what time the user visited which site and from which machine.

    9) downloads - list of downloads.


    10) useragent - report on programs used

    The first part of the report displays the IP address and used useragents.


    In the second - a general list of useragents with distribution in percentage, taking into account versions.


    11) redirector - the report shows whose access was blocked by a redirector. Squidguard, dansguardian and rejik are supported; the log format is customizable.


    SARG has more than 120 configuration parameters, language support (100% of messages are translated into Russian), regular expression support, LDAP integration, the ability to give users access only to their own reports on the web server (via .htaccess), the ability to convert logs to its own format to save space, export of reports to a text file for subsequent loading into a database, and tools for working with squid log files (splitting one or more log files by day).

    It is possible to create reports for a specific set of specified groups, for example, if you need to make a separate report for a department. In the future, access to the web page with department reports can be provided, for example, to managers using a web server.

    You can send reports by e-mail; however, for now only the topusers report is supported, and the message itself is plain text without HTML.

    You can exclude certain users or certain hosts from processing. You can set aliases for users, combining the traffic of several accounts into one, for example, all outstaffers. You can also set aliases for sites, for example, combine several social networks into a certain alias, in this case all parameters for the specified domains (number of connections, volume of traffic, processing time) will be summed up. Or, using a regular expression, you can “discard” domains above level 3.
    It is possible to dump lists of users who exceeded specified volumes over the period into separate files. The output is several files, for example: userlimit_1G.txt (exceeded 1 GB), userlimit_5G.txt (exceeded 5 GB) and so on; 16 limits in total.
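    The aliasing and limit features described above are driven by sarg.conf options along these lines (a sketch; the file paths are assumptions, and the option names should be checked against your sarg.conf):

```
# hypothetical fragment of sarg.conf
usertab   /etc/sarg/usertab      # maps accounts/IPs to real user names
hostalias /etc/sarg/hostalias    # site alias rules (regex supported)
# dump users exceeding 1000 MB for the period into a separate file
per_user_limit /var/tmp/userlimit_1G.txt 1000
```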

    SARG also has a couple of PHP pages in its arsenal: one for viewing current connections to squid, and one for adding domain names to squidguard block lists.

    In general, this is a very flexible and powerful tool that is easy to learn. All parameters are described in the default configuration file; the project on sourceforge has a more detailed description of all parameters in the wiki section, divided into groups, and examples of their use.

    free-sa

    A Russian-developed program. There have been no new versions since November 2013. It claims faster report generation than competing programs and less disk space for finished reports. Let's check!

    In terms of operating logic, this program is closest to SARG (the author himself compares it with SARG), so we will compare the two.

    I was pleased that there were several design themes. The theme consists of 3 css files and 4 png icons corresponding to them.

    Reports are indeed generated faster: the daily report took 4 minutes 30 seconds versus SARG's 12 minutes. The disk-space claim did not hold, however: the reports occupy 440 MB (free-sa) versus 336 MB (SARG).

    Let's set a more difficult task: processing a 3.2 GB log file covering 10 days and containing 26.3 million lines.

    Free-sa again finished faster: 46 minutes, with a report occupying 3.7 GB of disk space. SARG took 1 hour 10 minutes, and its report occupies 2.5 GB.

    But both of these reports are awkward to read. Who, for example, would want to manually work out which domain is more popular - vk.com or googlevideo.com - and manually sum the traffic of all their subdomains? If you keep only 2nd-level domains in the SARG settings, report creation takes about the same time, but the report itself now occupies 1.5 GB on disk (the daily report shrank from 336 MB to 192 MB).

    Details

    When entering the main page, we see something like the following (the blues theme is selected):


    To be honest, the purpose of displaying the year and months is unclear; when you click on them, nothing happens. You can write something in the search field, but again nothing happens. You can select the period of interest.

    List of blocked URLs:

    CONNECT method report:


    PUT/POST method report:



    Popular sites:


    The report on the effectiveness of the proxy server seemed interesting:


    User report:


    When you click on the graph icon in the second column, we get a graph of Internet usage by a specific user:


    When you click on the second icon, we get a table of Internet channel loading by hour:

    When you select an IP address, we get a list of sites by user in descending order of traffic:


    All statistics are displayed in bytes. To switch to megabytes you need to set the parameter

    reports_bytes_divisor="M"

    The program does not accept compressed log files, does not accept more than one file with the -l parameter, and does not support filtering files by mask. The author of the program suggests circumventing these restrictions by creating named pipes.
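    The named-pipe workaround might be sketched like this; since free-sa itself cannot be run here, `wc -l` stands in for the analyzer (which would instead be pointed at the pipe via its -l parameter), so the snippet is runnable anywhere:

```shell
#!/bin/sh
# Sketch: feed several compressed logs to a reader that accepts only one
# plain file, via a named pipe. `wc -l` stands in for the analyzer.
set -e
tmp=$(mktemp -d)
printf 'line1\nline2\n' | gzip > "$tmp/access.log.1.gz"
printf 'line3\n'        | gzip > "$tmp/access.log.2.gz"
mkfifo "$tmp/access.pipe"
# decompress all logs into the pipe in the background...
zcat "$tmp"/access.log.*.gz > "$tmp/access.pipe" &
# ...while the "analyzer" reads the pipe as if it were one plain log
lines=$(wc -l < "$tmp/access.pipe")
wait
rm -r "$tmp"
echo "$lines"
```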

    An annoying glitch was discovered: when a log line is too long, timestamps appear in place of addresses:


    When viewing the traffic of this “user” you can see the domain with the source of the error:


    Thus, the number of users has increased several times.

    If we compare these two programs, free-sa creates the report a little faster. I was not able to detect a 20-fold increase in speed, as stated by the author. Perhaps it can be seen under certain conditions. I think it doesn’t matter how long it takes to create a weekly report at night - 30 minutes or 50. In terms of the amount of space occupied by reports, free-sa has no advantage.

    lightsquid

    Perhaps the most popular traffic counter. It works quickly, reports do not take up much disk space. Although this program has not been updated for a long time, I still decided to consider its capabilities in this article.

    The logic of the program is different: the program reads the log and creates a set of data files, from which it then builds web pages. That is, there are no pre-generated reports here; pages are generated from the data on the fly. The advantage of this approach is obvious: to obtain a report, it is not necessary to parse all the logs for the period; it is enough to "feed" the accumulated log to lightsquid once a day. This can be done via cron, even several times a day, so that new data appears in the reports promptly.
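    A cron entry feeding the day's log to lightsquid might look like this (a sketch; the paths are assumptions, and lightparser.pl is the parser script shipped with lightsquid):

```
# /etc/cron.d/lightsquid (hypothetical)
# import the current day's portion of the log several times a day
15 8-20/4 * * * root cd /var/www/lightsquid && ./lightparser.pl today
```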

    There are some drawbacks: it is impossible to process logs from different servers and collect statistics in one place: when processing a log for a day from another server, the existing statistics for that day are erased.

    There is a strange limitation: lightsquid accepts both uncompressed log files and compressed ones (gz, specifically), but in the latter case the file name must have the form access.log.X.gz; files named access.log-YYYYMMDD.gz are not accepted.

    Through simple manipulations we overcome this limitation and see what happens.
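    One such "simple manipulation" is just renaming the rotated files into the expected form; a runnable sketch on sample file names (the date scheme is an assumption):

```shell
#!/bin/sh
# Sketch: rename access.log-YYYYMMDD.gz files into the access.log.X.gz
# form that lightsquid expects.
set -e
tmp=$(mktemp -d)
cd "$tmp"
touch access.log-20150101.gz access.log-20150102.gz   # sample rotated logs
i=1
for f in access.log-*.gz; do      # the glob sorts names, i.e. by date
    mv "$f" "access.log.$i.gz"
    i=$((i+1))
done
renamed=$(ls access.log.*.gz | wc -l)
cd /
rm -r "$tmp"
echo "$renamed"
```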

    Details

    The report for the month (total traffic 3 TB, 110 million lines) took up 1 GB of disk space.

    On the home page we see traffic by day for the current month.


    When you select a day, we see a report for the day for all users:


    If groups are specified, the name of the group to which the user belongs is displayed in the right column. Users who are not members of any group are placed in the group "00 no in group" (they are marked with a question mark in this report).

    When you select grp on the main page for the corresponding date, you are taken to the user report page, divided into groups. Those not included in any group are listed first, then the groups in order.


    When you click on the name of a group in the table on the right, we go below to the place on the page where the report for this group begins:


    When you click on “Top sites report” we get a report on popular sites for the day:


    Big files report:


    Let's move on to the table on the right.
    Here you can get a list of top sites for the month and for the whole year (they look the same, so no screenshot), general statistics for the year and month, as well as statistics for the year and month by group.

    Statistics for the month:


    By clicking on the clock icon we can see a table of sites, access time and traffic consumed per hour:


    Statistics for the day are displayed here, but for the month and for the year it will look approximately the same, hourly statistics for domains will be summed up.

    When you click on the graph icon, we can see the user’s traffic consumption during the month:


    The graph columns are clickable: when you click on a column, you go to the user’s statistics for another day.

    By clicking on [M], we will receive a report on the user’s traffic consumption during the month, indicating the volume for each day and for the full week.

    When you click on the user's name, we get a list of sites that the user visited in descending order of traffic:


    Well, that seems to be all. Everything is simple and concise. IP addresses can be converted to domain names. Using regular expressions, domain names can be combined into 2nd level domains; just in case, here is a regular expression:

    $url =~ s/(\w+:\/\/)??([\w\-]+\.){0,}([\w\-]+\.){1}(\w+)(.*)/$3$4/o;

    If you have skills in perl, you can customize it to suit your needs.
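    The same idea can be tried outside perl, for example with sed, when experimenting with what such a collapse would do to your domain list (a sketch; the character-class pattern is an assumption, not lightsquid's own rule):

```shell
#!/bin/sh
# Collapse any host name to its 2nd-level domain.
collapse() {
    echo "$1" | sed -E 's/^([[:alnum:]-]+\.)*([[:alnum:]-]+\.[[:alnum:]]+)$/\2/'
}
collapse video.cdn.example.com   # prints example.com
collapse example.com             # already 2nd level, printed unchanged
```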

    squidanalyzer

    A program similar to lightsquid and also written in Perl. Prettier design. The latest version 6.4 was released in mid-December of this year, many improvements have been made. Program website: squidanalyzer.darold.net.

    Squidanalyzer can use multiple processor cores (the -j option), which speeds up report generation, but this only applies to uncompressed files. Packed files (the gz format is supported) are processed on a single core.
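    One way around the single-core limit for packed logs is to pre-decompress them and then run squid-analyzer with -j on the plain files. A runnable sketch of the decompression step (the follow-up squid-analyzer invocation is an assumption and is only mentioned in a comment):

```shell
#!/bin/sh
# Sketch: decompress rotated logs so a multi-core run becomes possible,
# e.g. followed by: squid-analyzer -j 4 access.log.1 access.log.2 ...
set -e
tmp=$(mktemp -d)
printf 'a\nb\n' | gzip > "$tmp/access.log.1.gz"
printf 'c\n'    | gzip > "$tmp/access.log.2.gz"
for f in "$tmp"/access.log.*.gz; do
    zcat "$f" > "${f%.gz}"        # access.log.1.gz -> access.log.1
done
total=$(cat "$tmp"/access.log.? | wc -l)
rm -r "$tmp"
echo "$total"
```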

    For comparison with lightsquid: the same report on the same server took about a day to build and occupies 3.7 GB of disk space.

    Just like lightsquid, squidanalyzer will not be able to combine two or more log files from different servers for the same period.

    More details

    Home page - you can select the year of the report.

    When you select any period (year, month, week, day), the appearance of the web pages will be similar: at the top there is a menu with the following reports: MIME types, Networks, Users, Top Denied, Top URLs, Top Domains. Below are proxy statistics for the selected period: Requests (Hit/Miss/Denied), Megabytes (Hit/Miss/Denied), Total (Requests/Megabytes/Users/Sites/Domains). Below is a graph of the number of requests per period and traffic.

    There is a calendar in the upper right corner. When you select a month, you can see brief statistics and a download graph by day:


    The calendar allows you to select a week. When selected, we will see similar statistics:


    When you select a day, you see statistics by hour:


    Content Type Report:


    Networks report.


    User report.


    When you select a user, we get his statistics for the period.



    Prohibited resources:


    Report on 2nd level domains.


    For my part, I would note that the program becomes very slow as data accumulates: with each new log, statistics for the week, month and year are recalculated. Therefore, I would not recommend this program for processing logs from a server with a lot of traffic.

    ScreenSquid

    This program has a different logic: the log is imported into a MySQL database, then data is requested from it when working in the web interface. The database with the processed ten-day log mentioned earlier occupies 1.5 GB.

    More details

    The program cannot import log files with arbitrary names; it is tied to the name access.log.

    Home page:


    Brief statistics:


    You can create aliases for IP addresses:


    ... and then they can be combined into groups:


    Let's move on to the main thing - reports.

    On the left is a menu with report types:

    User traffic logins
    IP address user traffic
    Website traffic
    Top sites
    Top users
    Top IP addresses
    By time of day
    User traffic logins expanded
    IP address user traffic extended
    IP address traffic with resolution
    Popular sites
    Who downloaded large files
    Traffic by period (days)
    Traffic by period (day name)
    Traffic by period (months)
    HTTP statuses
    Login IP addresses
    Logins from IP addresses

    Examples of reports.

    IP address user traffic:


    Website traffic:


    Top sites:


    ... beyond this, to be honest, I didn't have the patience to study the remaining features, since pages began taking 3-5 minutes to generate. The "time of day" report for a day whose log had not been imported at all took more than 30 seconds to create; for a day with traffic, 4 minutes:


    That's all. I hope this material is useful to someone. Thank you all for your attention.

    Good afternoon, dear readers and guests! With this article I begin a description of the SQUID caching proxy server. This article is mostly introductory and theoretical.

    What is a proxy server and what is squid

    I'll start with the basics. squid is a caching proxy server for HTTP, FTP and other protocols. An HTTP proxy server is a program that makes HTTP requests on behalf of a client program (be it a browser or other software). A proxy may be caching or non-caching: a caching proxy saves responses to some storage for faster delivery to clients, while a non-caching one simply relays HTTP, FTP or other requests. Previously, traffic caching allowed quite significant traffic savings, but nowadays, with increasing Internet speeds, this has somewhat lost its relevance. Proxy servers can be arranged into a hierarchy to process requests; in that case they interact with each other using the ICP protocol.

    Squid runs on most operating systems (both unix and windows) and is licensed under the GNU GPL. It can process and cache HTTP, FTP, gopher, SSL and WAIS (removed in 2.6) requests, as well as DNS. The most frequent requests are stored in RAM. Currently there are two stable branches of squid: 2.7 and 3.1 (the differences can be found in the links at the end of the article). All dependencies when installing from packages are the same. A version 2 configuration file is compatible with version 3, but version 3 adds new options. In this article I will consider the squid3 version. It is also worth noting that squid3 keeps its configuration files in /etc/squid3, and its default logs in /var/log/squid3/, not /var/log/squid/, as many log analyzers "like to think".

    The word "caching" has been mentioned several times. What exactly is caching? It is a method of storing objects requested from the Internet on a server located closer to the requesting computer than the original one. An Internet object is a file, document, or response to a request to any service provided on the Internet (for example, FTP, HTTP, or gopher). The client requests an Internet object from the proxy server's cache; if the object is not yet cached, the proxy server fetches the object (either from the host specified in the requested URL, or from a parent or neighboring cache) and delivers it to the client.

    Squid proxy server operating modes

    The Squid proxy server can operate in the following three main modes:

    Transparent mode

    In this mode, HTTP connections made by clients are redirected to the proxy server without their knowledge or explicit configuration. In this mode, client configuration is not required. Disadvantages of this method: NAT configuration and traffic redirection are required, client authentication does not work, FTP and HTTPS requests are not redirected.
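    The traffic redirection mentioned above is typically done with a firewall rule along these lines (a sketch: the LAN interface name eth1 and the squid port are assumptions; adjust to your setup):

```
# redirect HTTP arriving from the LAN interface to the local squid
iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 -j REDIRECT --to-port 3128
```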

    Authentication mode

    To work in this mode, clients must be configured to work with a proxy server (the proxy server address must be specified in the connection settings). Client authentication and authorization can be performed via Kerberos, Ldap, NTLM, IP and Radius. It is possible to build interaction with Microsoft Active Directory servers by authenticating clients - domain members using the Kerberos protocol, and subsequent authorization of domain group members using LDAP in transparent mode (the user enters his password only when registering in the domain). For authorized groups, it is possible to use different access control settings and QoS (delay pools).

    Reverse proxy

    In this mode the proxy sits in front of one or more HTTP servers: Squid receives requests on behalf of the client, fetches data from the HTTP server, and passes it back to the client (for example, out to the Internet), caching the responses. This mode allows you to:

    • Using caching, which reduces the load on HTTP servers;
    • Load distribution between HTTP servers;
    • Masking HTTP servers and their characteristics;
    • Preventing web attacks on servers.

    SQUID operating mode diagrams

    transparent mode

    reverse mode

    authentication mode

    In the diagrams shown, green arrows indicate proxied traffic flows. In Linux, the movement of these flows is most often governed by the system's routing and firewall rules and by browser settings. In addition, very often the functions of router and proxy are performed by a single machine.

    Installing SQUID

    Before installing and configuring squid, you must make sure that the machine on which squid will run has access to the external network and that clients that will use this proxy have access to this machine. Installing the squid proxy server, like other software on Linux, is possible in various ways, described in the article. I'll cover how to install from a repository in Debian. So, to install squid you need to install the squid3 package, to do this run the following command:

    gw ~ # aptitude install squid3
    The following NEW packages will be installed:
      libltdl7(a) squid-langpack(a) squid3 squid3-common(a)
    0 packages updated, 4 new installed, 0 packages marked for removal, and 0 packages not updated.
    It is necessary to obtain 2,157 kB of archives. After unpacking, 10.3 MB will be occupied.
    Do you want to continue? y
    Get:1 http://ftp.ru.debian.org/debian/ squeeze/main libltdl7 i386 2.2.6b-2
    Get:2 http://ftp.ru.debian.org/debian/ squeeze/main squid-langpack all 20100628-1
    Get:3 http://ftp.ru.debian.org/debian/ squeeze/main squid3-common all 3.1.6-1.2+squeeze2
    Get:4 http://ftp.ru.debian.org/debian/ squeeze/main squid3 i386 3.1.6-1.2+squeeze2
    Received 2,157 kB in 9s (238 kB/s)
    Selecting the previously unselected package libltdl7.
    (Reading the database... there are currently 41133 files and directories installed.)
    The libltdl7 package is unpacked (from the file .../libltdl7_2.2.6b-2_i386.deb)...
    Selecting the previously unselected squid-langpack package.
    The squid-langpack package is unpacked (from the file .../squid-langpack_20100628-1_all.deb)...
    Selecting the previously unselected squid3-common package.
    The squid3-common package is unpacked (from the file .../squid3-common_3.1.6-1.2+squeeze2_all.deb)...
    Selecting the previously unselected squid3 package.
    The squid3 package is unpacked (from the file .../squid3_3.1.6-1.2+squeeze2_i386.deb)...
    Triggers for man-db are processed...
    The libltdl7 package (2.2.6b-2) is configured...
    The squid-langpack package is configured (20100628-1)...
    Configuring the squid3-common package (3.1.6-1.2+squeeze2)...
    Configuring the squid3 package (3.1.6-1.2+squeeze2)...
    Creating Squid HTTP proxy 3.x spool directory structure
    2012/02/15 21:29:41| Creating Swap Directories
    Restarting Squid HTTP Proxy 3.x: squid3
    Creating Squid HTTP Proxy 3.x cache structure ... (warning).
    2012/02/15 21:29:43| Creating Swap Directories.

    As you can see, while installing the package an attempt was made to create the cache directory, but since it is not yet configured, a warning appears. Squid has also been added to startup, launched, and is accepting connections on all interfaces. But because it is not configured, access to Internet pages through the server is restricted. The squid config is located in /etc/squid3/squid.conf; it consists of more than 5.5 thousand lines, and its syntax is practically no different from the config of any other service. Don't rush to change settings right away, or you will never untangle the result. Let's look at the config offered to us by default, without comments and empty lines:

    gw ~ # grep -v ^# /etc/squid3/squid.conf | grep -v ^$
    acl manager proto cache_object
    acl localhost src 127.0.0.1/32 ::1
    acl to_localhost dst 127.0.0.0/8 0.0.0.0/32 ::1
    acl SSL_ports port 443
    acl Safe_ports port 80          # http
    acl Safe_ports port 21          # ftp
    acl Safe_ports port 443         # https
    acl Safe_ports port 70          # gopher
    acl Safe_ports port 210         # wais
    acl Safe_ports port 1025-65535  # unregistered ports
    acl Safe_ports port 280         # http-mgmt
    acl Safe_ports port 488         # gss-http
    acl Safe_ports port 591         # filemaker
    acl Safe_ports port 777         # multiling http
    acl CONNECT method CONNECT
    http_access allow manager localhost
    http_access deny manager
    http_access deny !Safe_ports
    http_access deny CONNECT !SSL_ports
    http_access allow localhost
    http_access deny all
    http_port 3128
    hierarchy_stoplist cgi-bin ?
    coredump_dir /var/spool/squid3
    refresh_pattern ^ftp:           1440    20%     10080
    refresh_pattern ^gopher:        1440    0%      1440
    refresh_pattern -i (/cgi-bin/|\?) 0     0%      0
    refresh_pattern .               0       20%     4320

    As you can see, in the default configuration the proxy server is running and allows requests only from addresses 127.0.0.0/8. You should carefully review the entire list and comment out the lines with ports of unnecessary or unused services. A fuller understanding of this config will come after reading the following sections. Thus, if we launch the lynx console browser pointing at our proxy, we can retrieve the requested page:

    gw ~ # # launch the browser specifying the ya.ru page:
    gw ~ # http_proxy=http://127.0.0.1:3128 lynx ya.ru
    Searching for "ya.ru" first
    gw ~ # # in the log we see the access to the specified page:
    gw ~ # cat /var/log/squid3/access.log
    1329527823.407    110 127.0.0.1 TCP_MISS/200 9125 GET http://ya.ru/ - DIRECT/93.158.134.203 text/html

    Some parameters in the squid configuration file can be used multiple times (for example, acl). Some parameters, especially those with one value, can only be used once. In this case, when using this parameter 2 or more times, the last value will be used. For example:

    logfile_rotate 10
    # the parameter is set twice - the final value, 5, will be used
    logfile_rotate 5

    squid management

    The parameters with which your distribution's squid was compiled can be viewed with the command squid3 -v. For example, in Debian Squeeze squid is built with the parameters given below:

    --prefix=/usr                     - prefix for the other paths:
    --mandir=$(prefix)/share/man      - directory for man pages
    --libexecdir=$(prefix)/lib/squid3 - directory with executable modules (including helpers)
    --sysconfdir=/etc/squid3          - configuration storage directory
    --with-logdir=/var/log/squid3     - log storage directory
    and so on...

    Setting up squid

    I'll start the description of squid3 settings with the basic ones, which are advisable for any proxy server configuration. The squid config is located in /etc/squid3/squid.conf; this is the main configuration file containing all the settings. (In Debian and RedHat distributions, parameters from the startup files /etc/default/squid3 and /etc/sysconfig/squid3, respectively, are also read at startup.) As mentioned, it is more than 5 thousand lines long, and it is not worth rushing to configure anything without understanding it first. The squid3 config syntax is classic: lines starting with # are comments, and parameters are lines of the form "parameter value". The configuration file is divided into sections for convenience, but it is important to remember that parameters are parsed "from top to bottom" in order. Also, using the include parameter you can attach external configuration files.

    By default, the name of the host running Squid is resolved using gethostname(). Depending on your DNS settings, it may not always unambiguously determine the name that will appear in logs and on error pages ("Generated ... by server.com (squid/3.0.STABLE2)"). To record the host name correctly, specify this name (as an FQDN) in the parameter:

    visible_hostname myproxy

    By default, squid accepts connections on all interfaces. If our server faces the outside world with one of its network interfaces, it is advisable to limit connections to the local network interface only (for example, 10.0.0.10 on a /24 network). The http_port parameter is responsible for this:

    http_port 10.0.0.10:3128

    How these parameters work can be seen in the following listing:

    gw ~ # # check the daemon before configuring:
    gw ~ # netstat -antp | grep squ
    tcp   0   0 0.0.0.0:3128     0.0.0.0:*   LISTEN   25816/(squid)
    gw ~ # # changes made:
    gw ~ # grep ^http_port /etc/squid3/squid.conf
    http_port 10.0.0.10:3128
    gw ~ # # reread the modified config:
    gw ~ # /etc/init.d/squid3 reload
    Reloading Squid HTTP Proxy 3.x configuration files.
    done.
    gw ~ # # check operation with the changed config:
    gw ~ # netstat -antp | grep squ
    tcp   0   0 10.0.0.10:3128   0.0.0.0:*   LISTEN   25816/(squid)

    As you can see, the daemon now listens only on the specified network interface. It is also worth noting that newer versions of squid (3.1 and later) support specifying several http_port parameters, and different entries can carry additional keywords such as intercept, tproxy, accel and others, for example:

    gw ~ # grep ^http_port /etc/squid3/squid.conf
    http_port 10.0.0.10:3128
    http_port 10.0.0.10:3129 tproxy

    These parameters set the operating modes of the proxy server. For example, tproxy (old syntax: transparent) enables the corresponding interception mode. These modes deserve separate articles and may be covered in the future.

    Now you could configure a client computer and start using the Internet. But by default access is allowed only from localhost, and when trying to reach the web the user will receive an "Access denied" error. The log /var/log/squid3/access.log will contain something like this:

    1329649479.831 0 10.0.1.55 TCP_DENIED/403 3923 GET http://ya.ru/ - NONE/- text/html

    For local network clients to work, you must configure permissions using access control lists.

    Setting up squid access

    Access configuration actually consists of describing an access object with the acl parameter and then allowing or denying the described acl object with the http_access parameter. The simplest format of these settings is as follows:

    acl list_name selection_type selection_type_characteristics

    Here acl is the parameter describing an access control list whose name is given by list_name. The name is case sensitive. selection_type specifies the type that the selection_type_characteristics given below will match. Commonly used types are src (source of the request), dst (destination address), arp (MAC address), srcdomain and dstdomain (domain name of the source and destination, respectively), port (port), proto (protocol), time (time) and many others. Accordingly, the value of selection_type_characteristics is formed depending on selection_type.

    You can specify several acl lines with the same name and selection_type, in which case the entries are combined into one list with a logical OR. For example:

    acl site dstdomain site.com
    acl site dstdomain site.org
    # equivalent to the entry:
    acl site dstdomain site.com site.org

    In words: the access list named site covers all requests sent to site.com OR site.org. In addition, acl names are case sensitive, meaning acl site and acl Site are two different access lists.

    Once the access lists have been created, use the http_access parameter to allow or deny access to the specified acl. The general format is:

    http_access allow|deny [!]list_name

    Here http_access is the parameter specifying an allow or deny rule for the list_name that follows. The optional exclamation mark inverts the meaning of the list name: with it, list_name means "everyone except those who belong to this list". You can also specify several lists separated by spaces; access is then granted only to a request that matches all the specified lists. All allowing rules must come before the final prohibiting rule:

    http_access deny all

    A reasonable question arises: why set this rule if, for example, we allow squid access only to selected acls? After all, everyone who does not fall into those acls "passes by" anyway... It's simple: by default, squid applies the opposite of the last allow/deny rule. For example:

    # we have a single allowing rule for a certain acl user:
    http_access allow user
    # if a client accessing squid does not match this acl, the deny action is applied to it.
    # And if we have two rules:
    http_access allow user
    http_access deny user2
    # and the client matches neither acl user nor acl user2, then allow is applied to it,
    # that is, the action opposite to the last rule (http_access deny user2)
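    To illustrate the AND semantics of several lists on one http_access line (the list names and values below are hypothetical, not from the article):

```
acl lan src 10.0.0.0/24
acl workhours time MTWHF 09:00-18:00
# a request is allowed only if it comes from lan AND arrives during workhours
http_access allow lan workhours
http_access deny all
```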

    This, as they say, is the basics. Let's look at a simple example. Suppose we have 2 networks 10.0.1.0/24 and 10.0.0.0/24, as well as a host 10.0.4.1, which need to be allowed access to the Internet. To allow access, you need to create a description of a new access list in the "ACCESS CONTROL" section of the squid.conf file:

    acl lan src 10.0.1.0/24 10.0.0.0/24
    acl lan src 10.0.4.1

    For greater convenience, you can put these rules in a separate file and specify the path to it in place of selection_type_characteristics:

    gw ~ # # create a separate directory for storing access lists
    gw ~ # mkdir /etc/squid3/acls/
    gw ~ # # put our subnets and hosts in a separate file
    gw ~ # vim /etc/squid3/acls/lan.acl
    gw ~ # cat /etc/squid3/acls/lan.acl
    10.0.1.0/24
    10.0.0.0/24
    10.0.4.1
    gw ~ # # reference the created file in the config (the path must be enclosed in quotes)
    gw ~ # grep lan.acl /etc/squid3/squid.conf
    acl lan src "/etc/squid3/acls/lan.acl"

    Let's allow the created lan access list access to the Internet and tell the squid to re-read the configuration file:

    gw ~ # grep lan /etc/squid3/squid.conf | grep acce
    http_access allow lan
    gw ~ # service squid3 reload
    Reloading Squid HTTP Proxy 3.x configuration files.
    done.

    To summarize this section in a nutshell, we can say that acl identifies a Web request, and http_access allows or denies an identified request. Now our local clients are happy to use the Internet after setting up their browser!

    Configuring squid cache settings

    An important point in setting up squid is configuring its caching parameters. The cache location is set by the cache_dir parameter in squid.conf. The parameter format is as follows:

    cache_dir type path size L1 L2

    Here type is the cache storage scheme: ufs (unix file system), aufs (async ufs) or diskd (external processes to avoid blocking squid on disk I/O). ufs is recommended, although some praise aufs. path specifies the location of the cache in the file system (it must exist and be writable by the user squid runs as, usually proxy). size sets the maximum size, after which the cache begins to be cleaned. This parameter is the subject of many holy wars on the Internet. The ideal cache size is 2 to 10 GB depending on the number of clients, roughly 1 GB of cache for every 100 thousand requests/day; I stick to 5 GB. In squid, each cached object is stored in a separate file; the files are not dumped in one place, but organized in a two-level directory hierarchy. The number of first- and second-level directories is determined by the L1 and L2 parameters. These values can be left at their defaults, but to help you gauge the situation, here is a quote from bog.pp.ru:

    An experiment showed that with a 700 MB cache, only 2 first-level directories are used. That is, with the standard cache directory structure, a million objects (9 GB) fit into it "comfortably"; if there are more, you need to increase the number of top-level directories.

    You can use several cache_dir entries. This has a positive effect on performance, especially if the caches are placed on different disks. You can speed the cache up even more by placing it in tmpfs. Each cache_dir line can additionally take the read-only and max-size (maximum object size) options.

    The maximum size of a cached object is determined by the maximum_object_size parameter; the default value is 4 MB. I increased this value to 60 MB, because employees on the local network often download files of this type up to the specified size:

    maximum_object_size 61440 KB

    Likewise, there is the minimum_object_size parameter, responsible for the minimum object size; by default its value is 0, that is, disabled. I recommend raising it to 2-3 KB, which will reduce disk load when searching for small objects.
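    Putting the caching recommendations above together, a hedged squid.conf fragment might look like this (the values are the article's suggestions, not mandatory):

```
# 5 GB ufs cache with the default 16 first-level / 256 second-level directories
cache_dir ufs /var/spool/squid3 5120 16 256
maximum_object_size 61440 KB
minimum_object_size 2 KB
```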

    The amount of RAM used by squid is specified in the cache_mem parameter; the default value is 256 MB (in version 3.1). I left it at the default. You should change this value only if squid asks you to in the logs. After these changes, restart squid and the directory structure will be created:

    gw ~ # service squid3 start
    Starting Squid HTTP Proxy 3.x: squid3Creating Squid HTTP Proxy 3.x cache structure ... (warning).
    2012/02/19 22:58:21| Creating Swap Directories
    2012/02/19 22:58:21| /var/spool/squid3 exists
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/00
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/01
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/02
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/03
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/04
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/05
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/06
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/07
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/08
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/09
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/0A
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/0B
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/0C
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/0D
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/0E
    2012/02/19 22:58:21| Making directories in /var/spool/squid3/0F

    Many interesting questions and answers about squid's use of cache and memory are described elsewhere. With this, the standard proxy server setup can be considered complete.

    Example of setting up a transparent squid proxy

    What is a transparent proxy? It is an operating mode in which the client is not configured to work through a proxy and sends requests over HTTP as if the browser were talking to the web server directly. Outgoing HTTP requests are redirected to the port the proxy listens on; the proxy server processes them and returns responses to the client as if it were the web server. Thus, interaction with the proxy server is transparent to the client.

    It is important to understand: this mode supports only the HTTP protocol; gopher, FTP and other proxying are not supported. Also, Squid cannot work in transparent mode and authentication mode at the same time.

    To configure transparent mode, you must:

    1. Enable transparent mode in the proxy settings. This is done in the http_port parameter, for example:

    http_port ip:port transparent

    2. Redirect users' traffic to the desired port with an appropriate iptables rule:

    iptables -t nat -A PREROUTING -i incoming_interface_name -s local_network_subnet -p tcp --dport 80 -j REDIRECT --to-port squid_port
    # for example:
    iptables -t nat -A PREROUTING -i eth1 -s 10.0.0.0/24 -p tcp --dport 80 -j REDIRECT --to-port 3128

    That's all. Our redirected and unsuspecting users can now enjoy the proxy server.

    Troubleshooting

    Diagnosing squid primarily means reviewing the logs located in /var/log/squid3. Most problems are solved this way. If that does not help, switch the daemon to debug mode with the command squid3 -k debug, and the problem will be easier to find. So what is in a squid log? Log files contain various information about Squid's load and performance. In addition to access information, system errors and information about resource consumption, such as memory or disk space, are written to the log.
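    As a small illustration (a hypothetical helper, not part of squid), counting denied requests in the access log can quickly show whether an http_access rule is missing:

```shell
#!/bin/sh
# count_denied: count TCP_DENIED entries in a squid access.log.
# The default path is an assumption (Debian layout); pass another file as $1.
count_denied() {
  grep -c 'TCP_DENIED' "${1:-/var/log/squid3/access.log}"
}
```

Usage: `count_denied /var/log/squid3/access.log` — a sudden jump in the count after a config change usually points at rule ordering.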

    The squid log format is a string of values separated by one or more spaces:

    time.ms response_time ip_src squid_req_status/HTTP_status byte_snd method URL user squid_hier_status/ip_dst MIME

    • time - time in unix format (seconds since 00:00 1970-01-01)
    • ms - milliseconds, to 3 digits
    • response_time - response time in milliseconds
    • ip_src - source IP address
    • squid_req_status - squid request status (for example, TCP_HIT for previously cached objects, TCP_MISS if the requested object was not served from the local cache, UDP_HIT and UDP_MISS the same for sibling-cache requests)
    • HTTP_status - HTTP protocol status (200 for success, 000 for UDP requests, 403 for denied requests, 500 for errors)
    • byte_snd - bytes transmitted in the response, including the HTTP header
    • method - request method (GET, POST, etc.)
    • URL - requested URL
    • user - name of the authenticated user
    • squid_hier_status - squid hierarchy status: the result of queries to sibling/parent caches
    • ip_dst - IP address of the requested node
    • MIME - MIME type

    Let's look at an example:

    1329732295.053 374 10.0.1.55 TCP_MISS/200 1475 GET http://www.youtube.com/live_comments? - DIRECT/173.194.69.91 text/xml

    As you can see, the request was made at 1329732295.053, the response took 374 ms, the requesting host has IP 10.0.1.55, the object was not served from the local cache (TCP_MISS), the server response code was 200, 1475 bytes were transferred to the client, the GET method was used, the URL http://www.youtube.com/live_comments? was requested, the username was not determined, the object was fetched directly from the server with IP 173.194.69.91, and the MIME type was text/xml.
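    Using the field layout just described, a quick per-client traffic summary can be sketched in shell (a hypothetical helper: the field numbers assume the default log format above, and the log path is the Debian default):

```shell
#!/bin/sh
# squid_top_talkers: sum byte_snd ($5) per client IP ($3) and print the top talkers.
squid_top_talkers() {
  awk '{ bytes[$3] += $5 } END { for (ip in bytes) printf "%s %d\n", ip, bytes[ip] }' \
    "${1:-/var/log/squid3/access.log}" | sort -k2 -rn | head
}
```

Usage: `squid_top_talkers /var/log/squid3/access.log` prints "client_ip total_bytes" lines, heaviest consumers first.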

    Some final points about squid3

    In this article I reviewed the basic principles of the proxy server and the basic settings that let you implement a simple caching server and run squid in transparent mode. Squid also supports several authorization options (by IP, via LDAP, MySQL, NTLM, etc.), channel bandwidth limiting, and access control to Internet resources. I will cover squid with various authorization methods and traffic-control examples in the following articles.

    The article is very useful, easy to understand and quite detailed.
    Taken for preservation from here http://www.opennet.ru/base/net/squid_inst.txt.html

    Keywords: squid proxy acl
    From: Zabudkin Lev Miroslavovich
    Date: Fri, 14 Jan 2005 15:04:58 +0500 (YEKT)
    Subject: Setting up Squid for beginners

    Setting up Squid for dummies
    (version of article 1.0 dated October 29, 2004)

    Zabudkin Lev Miroslavovich,
    Russia, Tyumen region,
    Nizhnevartovsk,
    lead programmer
    MU "Library and Information System"
    [email protected]
    http://zabudkin.com

    INTRODUCTION
    ———-

    Many administrators face the problem of using the Internet access
    channel and time wisely; they think about saving time and money,
    about speed limits for certain types of files or individuals, and
    ultimately about economizing everything related to certain aspects
    of access to the global network.

    With this article, I will try to explain clearly and simply how to
    configure the most common proxy server, Squid.

    INITIAL SQUID SETTINGS FOR USER ACCESS
    —————————————————

    We will not go into the process of installing a Squid proxy server, but
    Let's move straight to setting it up.

    The most basic thing we should do after installation is to allow
    access to users of our local network. The http_port and http_access
    parameters serve this purpose. In addition, we will create an acl
    (access control list) for our local network.

    So, we need http_port because our Squid proxy server should serve
    only computers on our local network and be invisible to the outside
    world, to prevent "bad people" on the external network from using
    our channel or traffic, and from exploiting any "holes" discovered
    in the Squid proxy server code.

    The http_access parameter is used to allow or deny access to certain
    resources, to or from certain addresses, to certain sites, using
    certain protocols, ports and everything else that can be specified
    with acls (access control lists).

    Table N 1. Some subnets.

    Address range                | Full form                  | Short form
    192.168.0.1-192.168.0.254    | 192.168.0.0/255.255.255.0  | 192.168.0.0/24
    192.168.20.1-192.168.20.254  | 192.168.20.0/255.255.255.0 | 192.168.20.0/24
    192.168.0.1-192.168.254.254  | 192.168.0.0/255.255.0.0    | 192.168.0.0/16
    10.0.0.1-10.254.254.254      | 10.0.0.0/255.0.0.0         | 10.0.0.0/8

    Let's assume that you have a network with addresses from 192.168.0.1 to 192.168.0.254,
    then add a new Acl (see table N1):

    acl LocalNet src 192.168.0.0/24

    Let's assume that your Squid proxy server is located at
    192.168.0.200 on port 3128, then write in the configuration file:

    http_port 192.168.0.200:3128

    Our next action will be to prohibit the use of our proxy
    servers, except by users of our local network:

    http_access allow LocalNet
    http_access deny all

    Here the word allow grants permission and the word deny refuses it;
    that is, we allow access to the Squid proxy server from addresses of
    our local network and deny access to everyone else.

    Be careful when specifying http_access rules, since Squid applies
    them in the order you list them.

    LEARNING ACL (ACCESS CONTROL LISTS)
    ————————————-

    The access control system in the Squid proxy server is very flexible
    and extensive. It consists of elements with values and access lists
    with an allow or deny indication.

    The Acl format is as follows:

    acl name element list

    Access list format:

    http_access indication acl_name

    We'll look at some of the elements that the Squid proxy server lets
    you use, with examples, of course:

    * acl name src list

    With this element (src) we indicate the source IP address, that is
    the client from which the request came to our proxy server.

    In the following example, we will allow Vasya Pupkin and the department
    programming (Progs) access to our proxy server, and everyone
    we will prohibit the rest:

    acl Progs src 192.168.0.1-192.168.0.9
    acl Pupkin src 192.168.0.10
    http_access allow Progs
    http_access allow Pupkin
    http_access deny all

    * acl name dst list

    This element (dst) specifies the destination IP address, that is, the IP address
    the server that the client of the proxy server wants to access.

    In the following example, we will deny Vasya access to the
    194.67.0.0/16 subnet (which contains, for example, aport.ru):

    acl Net194 dst 194.67.0.0/16
    http_access deny Pupkin Net194

    * acl name dstdomain list

    With this element (dstdomain) we indicate the domain to be accessed
    which the client of the proxy server wants to receive.

    In the following example, we will deny Vasya access to the warez sites nnm.ru and
    kpnemo.ru:

    acl SitesWarez dstdomain .nnm.ru .kpnemo.ru
    http_access deny Pupkin SitesWarez

    If you need to specify the source domain, use
    srcdomain.

    * acl name [-i] srcdom_regex list
    * acl name [-i] dstdom_regex list

    These elements differ from srcdomain and dstdomain only in that they
    use regular expressions, which we do not cover in this article, but
    here is an example anyway:

    acl SitesRegexSex dstdom_regex sex
    acl SitesRegexComNet dstdom_regex \.com$ \.net$
    http_access deny Pupkin SitesRegexSex
    http_access deny Pupkin SitesRegexComNet

    In this example, we have denied access to Vasily Pupkin to all domains,
    containing the word sex and to all domains in the .com and .net zones.

    The -i switch is designed to ignore the case of characters in regular expressions.

    * acl name [-i] url_regex list

    With this element (url_regex) we specify the regular pattern
    URL expressions.

    An example of specifying files with the avi extension starting with the word sex:

    acl NoAviFromSex url_regex -i sex.*\.avi$

    If you wish to specify a pattern for the URL path only, i.e.
    excluding the protocol and hostname (domain), use urlpath_regex.

    Example for specifying music files:

    acl media urlpath_regex -i \.mp3$ \.asf$ \.wma$

    * acl name_acl port list

    Specifying the destination port number, that is, the port to which the
    the client of our proxy server will connect.

    As an example, we will prohibit everyone from using the Mirc program through our proxy
    server:

    acl Mirc port 6667-6669 7770-7776
    http_access deny all Mirc

    * acl name_acl proto list

    Specifying the transfer protocol.

    As an example, we will prohibit the above-mentioned Vasya from using the FTP protocol
    through our proxy server:

    acl ftpproto proto ftp
    http_access deny Pupkin ftpproto

    * acl name_acl method list

    Specifying the client's http request method (GET, POST).

    Consider a situation where Vasya Pupkin should be prohibited from
    checking his mail on the mail.ru site while still being allowed to
    browse the site freely; that is, we deny Vasya the ability to log
    into his mailbox via the login form on the site:

    acl SiteMailRu dstdomain .mail.ru
    acl methodpost method POST
    http_access deny Pupkin methodpost SiteMailRu

    USER RESTRICTIONS
    ————————-

    Quite often the situation arises that the access channel to the
    global Internet is not sufficient for all users, and a desire emerges
    to give everyone the maximum while not letting the channel "sag"
    because of those who like to download files.

    Squid proxy server tools allow you to achieve this in several ways:

    — the first way is to optimize object caching;

    - the second is a time limit for certain users, which is not
    quite correct;

    - the third way is to limit the speed for certain types
    files, users and everything that we define through Acl.

    TIME LIMITS
    ———————-

    You can limit users by time as follows:

    acl name time days hh:mm-HH:MM

    where the days are: M - Monday, T - Tuesday, W - Wednesday, H -
    Thursday, F - Friday, A - Saturday, S - Sunday.

    Here hh:mm must be less than HH:MM; that is, you can specify
    00:00-23:59, but you cannot specify 20:00-09:00.

    Let's prohibit the same Vasya from having access to the Internet from 10 to
    15 hours every day:

    acl TimePupkin time 10:00-15:00
    http_access deny Pupkin TimePupkin

    If you want to allow Vasya to use the Mirc program from 13 to 14
    hours, then we write:

    acl TimePupkin time 13:00-14:00
    http_access allow Pupkin TimePupkin Mirc
    http_access deny Pupkin Mirc

    What to do if you need to prohibit or allow on certain days
    weeks? Squid also allows you to do this, for example from 13 to 14 V
    Monday and Sunday:

    acl TimePupkin time MS 13:00-14:00

    As you can see, there is nothing complicated about this.

    SPEED LIMITS
    ————————

    Speed adjustment in the Squid proxy server is carried out using
    pools. A pool is a kind of beer keg that is constantly topped up to
    the brim, while customers pour beer into their glasses or other
    containers, for further internal consumption, as needed through
    their personal taps.

    Pools are regulated using three parameters: delay_class,
    delay_parameters, delay_access. The number of pools is specified using
    delay_pools parameter.

    Pools can be of three classes:

    1. The entire flow of beer is limited to one tap (for the entire
    network).
    2. The entire flow of beer is limited to one tap, but the tap is
    divided into sub-taps (one per IP).
    3. The entire flow of beer is limited to one tap, but the tap is
    divided into sub-taps (per subnet), which in turn are divided into
    mini-taps (one per IP).

    delay_pools number of_announced_pools
    delay_access pool_number action acl_name

    The action can be allow or deny. The pool applies to those for whom
    it is allowed and does not apply to those for whom it is denied. If
    allow all is specified and then deny Pupkin, the class will still
    affect Pupkin, because Pupkin's IP address, declared in acl Pupkin,
    is also included in the address list of acl all. Keep that in mind.

    delay_class pool_number pool_class
    delay_parameters pool_number parameters

    The parameters differ depending on the pool class:

    for the first class:

    delay_parameters 1 bytes_for_the_entire_network

    for the second class:

    delay_parameters 1 for_the_entire_network per_client

    for the third class:

    delay_parameters 1 for_the_entire_network per_subnet per_client

    For example, we have a 128 Kbit/s channel (roughly 15 KB/s) and we
    want to give Vasya (Pupkin) only 4 KB/s (one small glass for
    everything), give the programming department (Prog) 10 KB/s in total
    and 5 KB/s each (only two glasses), limit everyone else to 2 KB/s
    each and 10 KB/s in total, and limit mp3 (media) files to 3 KB/s for
    everyone (one small tap for the whole barrel of beer). Then we write:

    acl Prog src 192.168.0.1-192.168.0.9
    acl Pupkin src 192.168.0.10
    acl LocalNet src 192.168.0.0/255.255.255.0
    acl media urlpath_regex -i \.mp3$ \.asf$ \.wma$

    delay_pools 4
    # first let's limit mp3
    delay_class 1 1
    delay_parameters 1 3000/3000
    delay_access 1 allow media
    delay_access 1 deny all
    # let's limit poor Vasya
    delay_class 2 1
    delay_parameters 2 4000/4000
    delay_access 2 allow Pupkin
    delay_access 2 deny all
    # limit the programming department
    delay_class 3 2
    delay_parameters 3 10000/10000 5000/5000
    delay_access 3 allow Prog
    delay_access 3 deny all
    # now let's restrict the rest (second pool class)
    delay_class 4 2
    delay_parameters 4 10000/10000 2000/2000
    delay_access 4 deny media
    delay_access 4 deny Pupkin
    delay_access 4 deny Prog
    delay_access 4 allow LocalNet
    delay_access 4 deny all

    The question often arises: what is the best way to use such a small
    channel so that it is automatically divided among all those who are
    downloading something at any given moment? The clear answer is that
    this cannot be done with the Squid proxy server, but a few things
    can still be done:

    delay_class 1 2
    delay_parameters 1 -1/-1 5000/15000
    delay_access 1 allow LocalNet
    delay_access 1 deny all

    Thus we allocate the maximum channel to our entire network and its
    subnets (-1 means unlimited), and give each user a maximum of
    5 KB/s after he has downloaded the first 15 KB of a document at full
    speed.

    This way a client will not eat up the whole channel, but will
    receive the first 15 KB quickly.

    OPTIMIZING OBJECT CACHING IN SQUID
    ——————————————

    There are many types of files that are not updated often enough to
    justify the proxy server honoring web server headers claiming that
    an object is not cacheable or has supposedly just changed. This is a
    fairly common situation. The refresh_pattern parameter in the Squid
    configuration file is designed for such situations; we will not
    cover it exhaustively, with all its formulas, here.

    refresh_pattern [-i] string MINV percentage MAXV parameters

    This parameter determines the age of an object (read: a file) in the
    cache, i.e. whether it should be refreshed or not.

    MINV (minimum time) - time in minutes during which an object in the
    cache is considered fresh.

    MAXV (maximum time) - the maximum time in minutes during which an
    object is considered fresh.

    Parameters are one or more of the following:

    — override-expire — ignore information about the expiration of an object’s freshness
    and use MINV.

    - override-lastmod - ignore information about the file modification date and
    use MINV.

    - reload-into-ims - instead of the client's "do not cache documents"
    request (no-cache), send an "If-Modified-Since" request.

    — ignore-reload — ignore client requests “do not cache documents”
    (no-cache) or “reload the document” (reload).

    So we come to the most important thing: which types of files are
    updated least often? As a rule, these are various music files and
    pictures.

    Let's set the freshness time of objects: for pictures and music
    files we'll specify, say, as much as 30 days (43200 minutes):

    refresh_pattern -i \.gif$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.png$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.jpg$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.jpeg$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.pdf$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.zip$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.tar$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.gz$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.tgz$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.exe$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.prz$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.ppt$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.inf$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.swf$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.mid$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.wav$ 43200 100% 43200 override-lastmod override-expire
    refresh_pattern -i \.mp3$ 43200 100% 43200 override-lastmod override-expire

    The settings shown above are just an example, to make the essence
    clear.

    Now you can check your proxy server's efficiency; it will definitely
    have increased.

    CONCLUSION
    ———-

    The Squid proxy server is not the only proxy server; there are
    others. But as statistics show, most people use this particular
    proxy server, yet many beginners still have problems configuring it.

    With this article I have tried to shed at least a little light on
    some functions of the Squid proxy server for the broad audience.

    Now there will be a limit on download speed for different groups of people. Well, are you ready? :) Let's go...

    Start. Creating an ACL

    Create ACL groups:
    acl mp3_deny src "/etc/squid/lists/mp3_deny_users"
    acl super_users src "/etc/squid/lists/super_users"
    acl deny_all src "/etc/squid/lists/deny_all_users"

    So we created lists, or rather three groups of users, whose addresses are contained in the files. Since the IPs were assigned before my time and don't line up with who is allowed to download what, it is easier to list their IPs in a file than to create ranges, but do as you like :)
    acl mego_super_user src 192.168.0.0-256 # =) just be careful with this range
    Example of the contents of a list file
    nano "/etc/squid/lists/mp3_deny_users"
    192.168.0.213
    192.168.0.75
    192.168.0.52
    195.168.0.254

    Now it's time to create the lists of what is forbidden:
    acl mobile urlpath_regex -i (\.thm|\.sis|\.swf|\.jad|\.jar|\.3gp|\.mp4)((\#|\&|\?|\s)(1)|$)
    acl multimedia urlpath_regex -i (\.swf|\.mp3|\.m3u|\.flv|\.wav|\.vqf|\.avi|\.wmv|\.mpeg|\.mp|\.asf|\.mpe|\.dat|\.mpg|\.wma|\.midi|\.aiff|\.au|\.qt|\.ram|\.rm|\.iso|\.raw|\.mov)((\#|\&|\?|\s)(1)|$)
    acl archive urlpath_regex -i (\.tar.gz|\.gz|\.tar|\.zip|\.rar|\.cab|\.arj|\.lzh|\.ace|\.7-zip|\.gzip|\.uue|\.bz2|\.iso)((\#|\&|\?|\s)(1)|$)
    acl soft urlpath_regex -i (\.exe|\.msi|\.rpm)((\#|\&|\?|\s)(1)|$)
    acl mp3 urlpath_regex -i (\.wav|\.mp3|\.mp4)((\#|\&|\?|\s)(1)|$)

    That's all. The sixth pool is of the greatest interest: each host in it downloads everything at a speed of 10, and if the total speed within the subnet is exceeded, throttling of the channel begins; if hosts from other subnets are downloading as well, and there are more than two of them, the speed is reduced further...
    If you are building squid from source, be sure to configure it with the --enable-delay-pools option, otherwise delay pools will not work!
    P.S. I tried very hard to explain everything as clearly as possible. If I helped you figure it out, then I didn't write this topic in vain, and I'll be very happy. If something is not clear, ask questions - I will definitely answer.
    P.P.S. The default squid config helped me write all this; if you start reading it, you will discover a lot of new things!
    P.P.P.S. Dear KorP, unfortunately I don't have time for a domain now, so for now I'm writing what's in my head.
    UPD.
    reply_body_max_size 1000 allow all # replies larger than 1000 bytes will not be downloaded

    SQUID is a program that receives HTTP/FTP requests from clients and uses them to access Internet resources. The use of a proxy server (squid) makes it possible to use fictitious IP addresses on the internal network (Masquerading), increases the speed of request processing when re-applying (caching), and also provides additional security.

    There is no point in installing a proxy on your home machine, since the caching functions are performed by the browser. A proxy server should only be used if there are three or four computers on your network that need Internet access. In this case, the request from the browser to the proxy server is processed faster than from the browser to Internet resources, and thus performance increases. In this case, you can safely set the cache size in client browsers to zero.

    SQUID is more than just a proxy server. This is a kind of standard for caching information on the Internet. Due to the ubiquity of SQUID, I paid a lot of attention to its configuration in the book.

    The Squid proxy server is formed by several programs, including: the squid server program itself, as well as the dnsserver program - a program for processing DNS requests. When the squid program starts, it first starts a specified number of dnsserver processes, each of which runs independently and can only perform one DNS lookup. This reduces the overall DNS response waiting time.

    15.2. Installing SQUID

    SQUID can be installed from source or as an RPM package. Installing the SQUID RPM package is very simple - all you need to do is enter the command

    rpm -ih squid-2.3.STABLE2-3mdk.i586.rpm

    I'm using squid version 2.3. A newer version is available as source code. Sources can be obtained from ftp://ftp.squid.org. To unpack the source codes, run the following commands:

    gunzip squid-2.3.STABLE2-3-src.tar.gz
    tar xvf squid-2.3.STABLE2-3-src.tar.gz

    Now let's proceed directly to the installation:

    ./configure --prefix=/usr/local/squid

    SQUID will be installed in the directory specified by the --prefix option - /usr/local/squid. In addition to --prefix, you can use the options presented in Table 15.1.

    configure script options Table 15.1

    15.3. Setting up SQUID

    The SQUID server uses the squid.conf configuration file, which is usually located in the /etc/squid directory (or /usr/local/squid/etc in earlier versions). Open it in any text editor, for example: joe /usr/local/squid/etc/squid.conf. Next, perform the following sequence of actions:

    1. Specify the proxy provider:

    In this case, proxy.isp.ru becomes our “neighbor” (peer).
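The directive itself is not shown here; a sketch in the cache_peer format described in section 15.5.2 (the port numbers are assumptions) might look like this:

```
cache_peer proxy.isp.ru parent 3128 3130
```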

    2. Set the amount of memory available to squid and the directory for the cache:

    cache_mem 65536
    cache_dir /usr/local/squid/cache 1024 16 256

    where: 65536 - the amount of RAM in bytes that can be used for the cache;

    1024 - the number of megabytes allocated on the disk in the specified directory for the cache. The cached files will be stored in this directory. Needless to say, if you have several hard drives, then the cache should be placed on the fastest one.

    3. Specify the hosts that are allowed to access the proxy server:

    acl allowed_hosts src 192.168.1.0/255.255.255.0
    acl localhost src 127.0.0.1/255.255.255.255

    4. Specify the allowed SSL ports and deny the CONNECT method on all others:

    acl SSL_ports port 443 563
    http_access deny CONNECT !SSL_ports

    and deny access to everyone except those who are explicitly allowed:

    http_access allow allowed_hosts
    http_access deny all

    6. Register the users who are allowed to use squid (in the example under consideration these are den, admin, developer):

    acl allowed_users user den admin developer
    http_access allow allowed_users

    The maximum_object_size and minimum_object_size tags set restrictions on the size of transferred objects.

    Below is an example of denying access to any URL that matches the games pattern and allowing access to all others:
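A sketch of such a rule pair (the acl name games is an assumption, and http_access rule order matters):

```
acl games url_regex -i games
http_access deny games
http_access allow all
```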

    15.4. Launching SQUID

    Now that you have done the basic setup of SQUID, you need to run it: /usr/local/squid/bin/squid -z

    The -z option is required to create (initialize) the cache directories. Usually this parameter is needed only at the first start. Some other useful SQUID parameters are presented in Table 15.2.

    SQUID parameters Table 15.2

    Parameter Description
    -a port Sets the port for incoming HTTP requests
    -d Enables the mode of outputting debugging information to the standard error stream (on stderr)
    -f file Specifies a configuration file
    -h Provides help information
    -k reconfigure Sends a HUP signal
    -k shutdown Shuts down the proxy server
    -k kill Terminates squid immediately, without closing logs
    -u port Sets the port for incoming ICP requests
    -s Enables logging using syslog
    -v Displays information about the SQUID version
    -D Don't do a DNS test at startup
    -N Don't become a daemon (background process)
    -Y Faster recovery from failures

    15.5. squid.conf file format

    The squid.conf file specifies various proxy server configuration parameters. Let's look at them all in order.

    15.5.1. Network settings

    Port for client requests (see Fig. 15.1):

    Fig. 15.1. Proxy settings


    If there are no “neighbors” (peers), then set icp_port 0.

    htcp_port 4827 - port for communicating with neighbors via HTCP (ICP over TCP). When using this option, you must build squid with the --enable-htcp switch.

    The next parameter specifies at which address incoming packets should be received if the host has multiple interfaces. In version 2.3 this parameter is not present:

    When sending information, the specified address will be used as the source:

    The same, but for ICP (when sending):

    The same, but for ICP (when receiving):

    By default, this mode is enabled, but if the proxy server is located behind a bastion (firewall), then the passive_ftp parameter must be disabled:

    15.5.2. Neighbor options

    Neighbors are described using lines of the following format:

    cache_peer hostname type proxy-port icp-port options

    where: hostname - neighbor's name;

    type - type of neighbor: parent - senior, sibling - same level;

    proxy-port - proxy server port;

    icp-port - ICP port;

    options - parameters.

    In this case, each neighbor is written on a separate line.

    Parent - if the requested object is not in the local cache, the request is redirected to the parent; if the object is not in the parent's cache either, the parent forwards the request further, and so on, finally returning a ready response to the subordinate cache. If squid receives TCP_DENIED from the parent, the resource will be accessed directly.

    Sibling - if the requested object is not in the local cache, the request is redirected to the sibling; if the sibling does not have it either, it simply reports this, and no additional actions are taken.

    15.5.3. Cache management

    cache_swap_high - when this cache occupancy level (as a percentage) is reached, an accelerated process of deleting old objects begins.

    cache_swap_low - the deletion process stops when this level is reached.

    maximum_object_size - the maximum size of a cached object.

    minimum_object_size - objects smaller than this are not saved.
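Put together, a minimal cache-management fragment might look like this (the values are illustrative, not recommendations):

```
cache_swap_low 90
cache_swap_high 95
maximum_object_size 4096 KB
minimum_object_size 0 KB
```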

    15.5.4. Logging

    The SQUID logging modes are listed below, along with their associated logs. If you don't need a particular log, set none instead of the file name.

    cache_access_log /usr/local/squid/logs/access.log

    Every request to SQUID is logged. The log is named /usr/local/squid/logs/access.log.

    cache_log /usr/local/squid/logs/cache.log

    Process starts are logged. The log is called /usr/local/squid/logs/cache.log.

    cache_store_log /usr/local/squid/logs/store.log

    Writes of objects to the cache are logged. The log is named /usr/local/squid/logs/store.log.

    15.5.5. External program parameters

    ftp_user - the email address specified here will be used instead of a password for anonymous access to FTP servers.

    dns_nameservers list of IP addresses

    The value of this parameter is used instead of the list of DNS servers that is defined in the /etc/resolv.conf file; default is none.

    cache_dns_program /usr/local/squid/bin/dnsserver

    This parameter specifies the program used to resolve host names into IP addresses (the DNS helper).

    Allows authentication of clients making requests. In this case, the ACL proxy_auth must be defined.

    authenticate_program /usr/local/squid/bin/ncsa_auth /usr/local/squid/etc/passwd

    Traditional authentication program. Defined in ../auth_modules/NCSA.

    15.5.6. ACLs

    ACL (Access Control Lists) - access control lists. Quite often there is a need to group parameters of the same type into a single whole for their subsequent processing. To effectively solve this problem, access control lists (ACLs) are used. For example:

    acl SSL_ports port 443 563

    This entry means that a list named SSL_ports of type port is being created. The elements of the list are port numbers 443 and 563.

    You can add a new element to an existing list (the add parameter) like this:

    acl add SSL_ports 999

    You can remove an unnecessary element using the del parameter:

    acl del SSL_ports 999

    The ren parameter (from rename) allows you to rename the list:

    acl ren SSL_ports Allowed_ports

    The flush parameter allows you to delete all lists along with their contents:

    The ACL standard requires that the list name be preceded by a $ character, which, strictly speaking, makes all of the above examples incorrect. However, most filters, SQUID included, ignore this requirement, and you can specify list names without the dollar sign.

    So, ACL is an access list definition. Has the following format:

    acl name type string

    where: type is the type of the object;

    string is a regular expression.

    You can use a list:

    acl name type filename

    The parameters are listed one per line. The types that can be used when constructing ACLs are listed in Table 15.3.

    ACL types Table 15.3

    Type Description
    src ip-address/netmask Specifies the client's IP address
    src addr1-addr2/netmask Specifies a range of client addresses
    dst ip-address/netmask Specifies the destination (server) IP address
    time [day] [h1:m1-h2:m2] Time; day is one letter from SMTWHFA
    port List of ports
    port port1-port2 Port range
    proto Protocol - HTTP or FTP
    method Method - GET or POST
    browser [-i] regexp The User-Agent header is compared against the regular expression

    [-i] - letter case is ignored.
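For instance, the time type combines day letters with a time interval; a sketch restricting access to working hours (the acl names work_time and allowed_hosts here are assumptions) might be:

```
acl work_time time MTWHF 9:00-18:00
http_access allow allowed_hosts work_time
http_access deny all
```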

    15.5.7. Access Options

    http_access allow|deny aclname

    Allow proxy access via HTTP.

    icp_access allow | deny aclname

    Allow access to proxy via ICP.

    miss_access allow | deny aclname

    Allows the listed clients to receive MISS responses (objects fetched on their behalf rather than found in the cache) from your server.

    cache_peer_access cache-host allow|deny aclname

    Limit requests to this neighbor - extension for cache_peer_domain.

    proxy_auth_realm Squid proxy-caching web server

    A string of text that will be displayed on the client screen when prompted for a cache access name/password.

    15.5.8. Administration Settings

    cache_mgr - this parameter specifies the email address to which a letter will be sent if squid stops functioning.

    When running SQUID as root, change the UID to the one specified in the cache_effective_user parameter.

    When running SQUID as root, change the GID to the one specified in the cache_effective_group parameter.

    visible_hostname host_name

    This name will be mentioned in error messages.

    This parameter specifies a list of synonyms for the hostname.

    15.6. Opt out of advertising. Banner filter

    Don't want to waste extra time loading advertising banners? Me neither. Fortunately, SQUID makes it quite easy to solve this problem. Just paste the following lines into your /usr/local/etc/squid/squid.conf file:

    acl good_url url_regex "/usr/local/etc/squid/acl/good_url"
    acl bad_urlpath urlpath_regex "/usr/local/etc/squid/acl/bad_urlpath"
    acl bad_url url_regex "/usr/local/etc/squid/acl/bad_url"
    http_access deny bad_urlpath !good_url
    http_access deny bad_url !good_url

    Accordingly, you will need to create three files: good_url, bad_urlpath and bad_url. The bad_url file should contain “bad” URLs, for example:

    ^http://.*-ad.flycast.com/server/img/
    ^http://1000.stars.ru/cgi-bin/1000.cgi

    And in the bad_urlpath file - “bad” path patterns; banner files usually have names matching such patterns.

    Examples of good_url, bad_url_path and bad_url files can be found on my home page - http://dkws.narod.ru

    15.7. Channel Splitting

    Let's say you need to configure a proxy server so that one group of computers can work at one speed, and another at another. This may be required, for example, to distinguish between users who use the channel for work and users who use the channel resources for home purposes. Naturally, channel capacity is more important for the former than for the latter. Using a SQUID proxy, you can split the channel.

    To begin, in the configuration file, indicate how many pools, that is, user groups, you will have:

    delay_pools 2

    Next, define the pool classes. There are three classes in total:

    1. A single channel bandwidth limit shared by everyone.

    2. One overall limit plus 255 individual limits, one for each node of a class C network.

    3. An overall limit, a limit for each class C subnet of a class B network, and a separate limit for each node.

    Add the following directives to the squid.conf file:

    delay_class 1 1 # defines the first class 1 pool for home users
    delay_class 2 2 # defines a second class 2 pool for employees

    Now define the nodes that will belong to the pools:
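Pool membership is expressed with delay_access rules; a sketch (the acl names home_users and employees, and their subnets, are assumptions):

```
acl home_users src 192.168.0.0/255.255.255.0
acl employees src 192.168.1.0/255.255.255.0
delay_access 1 allow home_users
delay_access 1 deny all
delay_access 2 allow employees
delay_access 2 deny all
```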

    Then specify the restrictions:

    delay_parameters 1 14400/14400
    delay_parameters 2 33600/33600 16800/33600

    As I noted above, for a class 1 pool one limit is used for all computers in the pool - 14400 bytes per second. The first number sets the fill rate for the entire pool (bytes/second); the second is the maximum limit (the bucket size in bytes).

    For a class 2 pool, accordingly, restrictions are used for the entire subnet and separately for each user. If we had a class 3 pool defined, then the restrictions for it would look something like this:

    delay_parameters 3 128000/128000 64000/128000 12800/64000

    The first two numbers set the fill rate and the maximum limit for everyone, respectively. The next pair of numbers determines the per-subnet fill rate and maximum limit, and the third determines the fill rate and maximum limit for an individual user.

    15.8. Traffic accounting programs

    To monitor the operation of SQUID and generally to account for traffic, you can use the following programs:

    sqmgrlog - http://www.ineparnet.com.br/orso/index.html

    mrtg - http://www.switch.ch/misc/leinen/snmp/perl/

    iptraf - http://dkws.narod.ru/linux/soft/iptraf-2.4.0.tar.gz

    bandmin - http://www.bandmin.org

    webalizer (Apache analysis) - http://www.mrunix.net/webalizer/

    These programs come with fairly readable documentation, so I won’t go into detail about their use. The MRTG program is described in paragraph 8.5.
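All of these tools essentially aggregate squid's access.log. The core of such accounting can be sketched in one line of shell - this assumes squid's native log format, where field 3 is the client address and field 5 is the reply size in bytes; it is a sketch, not a replacement for the programs above:

```shell
# Sum the bytes delivered to each client. A small inline log is used here
# instead of the real /usr/local/squid/logs/access.log.
awk '{bytes[$3] += $5} END {for (ip in bytes) printf "%s %d\n", ip, bytes[ip]}' <<'EOF' | sort
986689731.123    210 192.168.1.5 TCP_HIT/200 4512 GET http://example.com/ - NONE/- text/html
986689732.456    310 192.168.1.5 TCP_MISS/200 1024 GET http://example.com/a - DIRECT/10.0.0.1 text/html
986689733.789    150 192.168.1.7 TCP_MISS/404 512 GET http://example.com/b - DIRECT/10.0.0.1 text/html
EOF
# prints:
# 192.168.1.5 5536
# 192.168.1.7 512
```

Pointed at the real log file instead of the here-document, this yields per-client byte totals - which is what the programs listed above compute, with reporting layered on top.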

    15.9. Setting up clients

    After you have configured the proxy server, let's move on to configuring the clients, that is, the users' browsers. I have no doubt that you know how to configure one browser or another; I will just remind you of the configuration procedure for some common ones.

    Internet Explorer 5

    Menu Tools→Internet Options→Connection tab→Network settings. In the window that appears, set the necessary parameters, that is, the name of the proxy server and its port (see Fig. 15.2).

    Fig. 15.2. Setting up Internet Explorer


    Netscape Communicator

    Menu Edit→Preferences→Advanced→Proxies→Manual Proxy Configuration→View (see Figure 15.3).

    Fig. 15.3. Setting up Netscape Communicator


    Konqueror

    Menu Settings→Settings→Proxies (see Fig. 15.4).

    Fig. 15.4. Setting up Konqueror