Intro
This guide demonstrates how to import IP2Proxy Proxy Detection data (PX11) for IPv6 in CSV form into Apache Cassandra and then query the data from a PHP page.
First, you will need to download the IP2Proxy PX11 IPv6 CSV file.
Download the commercial version at https://ip2location.com/download?code=PX11IPV6
Extract the IP2PROXY-IPV6-PROXYTYPE-COUNTRY-REGION-CITY-ISP-DOMAIN-USAGETYPE-ASN-LASTSEEN-THREAT-RESIDENTIAL-PROVIDER.CSV file from the downloaded ZIP archive and store it in the /mydata folder (our example; yours may differ).
Important Note
We will not cover the installation of Cassandra or PHP in this guide. We assume you have already set up Cassandra and PHP on localhost and are running PHP via Apache (also on localhost). For this example, we are using a Debian machine.
You will also need to install the PHP Cassandra driver from https://pecl.php.net/package/cassandra
The sample PHP code also calls gmp_import(), so the PHP GMP extension must be enabled.
Pre-process the CSV data
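Before we import the CSV data, we have to insert a dummy column into the data to serve as the partition key. Cassandra only orders rows within a partition, so to perform an ordered search across the whole data set, all of the rows must have the same partition key.
In Bash, run the following command to prefix every row in the CSV file with the dummy column and write the results to a new CSV file. The same command left-pads ip_from and ip_to with zeros to 40 digits, because the values are stored as text and compared as strings.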
awk 'BEGIN { FS = OFS = "\",\"" }
{
    $1 = substr($1, 2)                            # drop the opening quote of ip_from
    $1 = sprintf("%40s", $1); gsub(/ /, "0", $1)  # left-pad ip_from with zeros to 40 digits
    $2 = sprintf("%40s", $2); gsub(/ /, "0", $2)  # left-pad ip_to with zeros to 40 digits
    printf("\"px11ipv6\",\"")                     # prepend the dummy partition key column
    print
}' /mydata/IP2PROXY-IPV6-PROXYTYPE-COUNTRY-REGION-CITY-ISP-DOMAIN-USAGETYPE-ASN-LASTSEEN-THREAT-RESIDENTIAL-PROVIDER.CSV > /mydata/IP2PROXY-IPV6-PROXYTYPE-COUNTRY-REGION-CITY-ISP-DOMAIN-USAGETYPE-ASN-LASTSEEN-THREAT-RESIDENTIAL-PROVIDER.CSV2
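To see what the pre-processing step does before running it on the full file, the same awk program can be applied to a single row. The sketch below uses one made-up row in the PX11 column layout; every value in it is a placeholder chosen for illustration, not real IP2Proxy data.

```shell
# One made-up row in the PX11 column layout (placeholder values, not real data).
sample='"281470816487424","281470816487679","PUB","US","United States of America","California","Mountain View","Example ISP","example.com","DCH","64500","Example AS","7","-","-"'

# Apply the same awk program used for the full file to this single row.
converted=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = OFS = "\",\"" }
{
    $1 = substr($1, 2)
    $1 = sprintf("%40s", $1); gsub(/ /, "0", $1)
    $2 = sprintf("%40s", $2); gsub(/ /, "0", $2)
    printf("\"px11ipv6\",\"")
    print
}')

# Show the first three fields: dummy key, padded ip_from, padded ip_to.
printf '%s\n' "$converted" | cut -d, -f1-3
```

The first field should now be the constant "px11ipv6", and the two IP number fields should each be 40 characters long with leading zeros. The remaining fields pass through unchanged.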
Importing the CSV data into Cassandra
In cqlsh, run the following command to create the keyspace (the equivalent of a database).
CREATE KEYSPACE IF NOT EXISTS ip2proxy WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
After creating the keyspace, select it by running the command below.
USE ip2proxy;
Next, run the following command to create the table.
DROP TABLE IF EXISTS ip2proxy_px11ipv6;
CREATE TABLE IF NOT EXISTS ip2proxy_px11ipv6 (
    dummy varchar,
    ip_from varchar,
    ip_to varchar,
    proxy_type varchar,
    country_code varchar,
    country_name varchar,
    region_name varchar,
    city_name varchar,
    isp varchar,
    domain varchar,
    usage_type varchar,
    asn varchar,
    as varchar,
    last_seen varchar,
    threat varchar,
    provider varchar,
    PRIMARY KEY (dummy, ip_to)
) WITH CLUSTERING ORDER BY (ip_to ASC);
Now that we have a table, we can import the data from our pre-processed CSV file into it.
COPY ip2proxy_px11ipv6 (
    dummy, ip_from, ip_to, proxy_type, country_code, country_name,
    region_name, city_name, isp, domain, usage_type, asn, as,
    last_seen, threat, provider
) FROM '/mydata/IP2PROXY-IPV6-PROXYTYPE-COUNTRY-REGION-CITY-ISP-DOMAIN-USAGETYPE-ASN-LASTSEEN-THREAT-RESIDENTIAL-PROVIDER.CSV2';
Querying the data in PHP
Now, create a PHP file called test.php on your website.
Paste the following PHP code into it and then open it in the browser:
<?php
$ip = '8.8.8.8';

// Convert an IPv6 address to its decimal IP number, returned as a string (needs the GMP extension).
function ip62long($ipv6) {
    return (string) gmp_import(inet_pton($ipv6));
}

function queryIP2Proxy($myip) {
    $keyspace = 'ip2proxy';
    $cluster = Cassandra::cluster()->build(); // localhost
    $session = $cluster->connect($keyspace);
    $padzero = 40; // need to pad the IP numbers because Cassandra compares them as strings, not numbers

    // convert IP address to IP number; an IPv4 address is looked up as an IPv4-mapped IPv6 address
    if (filter_var($myip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4)) {
        $myip = '::FFFF:' . $myip;
    }
    // stop early on invalid input so that $ipnum is never left undefined
    if (!filter_var($myip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6)) {
        die('Invalid IP address' . "<br>\n");
    }
    $ipnum = ip62long($myip);

    // pad ipnum to 40 digits with leading zeroes so we can do string comparison
    $myipnum = str_pad($ipnum, $padzero, '0', STR_PAD_LEFT);

    // fetch the first row whose ip_to is greater than or equal to the padded IP number
    $statement = new Cassandra\SimpleStatement('SELECT * FROM ip2proxy_px11ipv6 WHERE dummy = \'px11ipv6\' AND ip_to >= \'' . $myipnum . '\' ORDER BY ip_to LIMIT 1');
    $future = $session->executeAsync($statement);
    $result = $future->get();

    if ($result->count() == 0) {
        die('No record found' . "<br>\n");
    }
    return $result[0];
}

$myresult = queryIP2Proxy($ip);

echo 'proxy_type: ' . $myresult['proxy_type'] . "<br>\n";
echo 'country_code: ' . $myresult['country_code'] . "<br>\n";
echo 'country_name: ' . $myresult['country_name'] . "<br>\n";
echo 'region_name: ' . $myresult['region_name'] . "<br>\n";
echo 'city_name: ' . $myresult['city_name'] . "<br>\n";
echo 'isp: ' . $myresult['isp'] . "<br>\n";
echo 'domain: ' . $myresult['domain'] . "<br>\n";
echo 'usage_type: ' . $myresult['usage_type'] . "<br>\n";
echo 'asn: ' . $myresult['asn'] . "<br>\n";
echo 'as: ' . $myresult['as'] . "<br>\n";
echo 'last_seen: ' . $myresult['last_seen'] . "<br>\n";
echo 'threat: ' . $myresult['threat'] . "<br>\n";
echo 'provider: ' . $myresult['provider'] . "<br>\n";
?>
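The query above relies on two things: rows sorted by ip_to, and a string comparison that behaves like a numeric one because of the zero padding. The Bash sketch below imitates that lookup on three made-up ranges, without Cassandra, to show why the padding matters. The helper pad40 and the labels range-A to range-C are hypothetical names used only for this illustration.

```shell
# Hypothetical helper: left-pad a number with zeros to 40 characters.
pad40() { printf '%40s' "$1" | tr ' ' '0'; }

# Three made-up ranges, each identified by its ip_to value.
rows=$(printf '%s,%s\n' "$(pad40 99)" range-A "$(pad40 199)" range-B "$(pad40 299)" range-C)

# Imitate "ip_to >= X ORDER BY ip_to LIMIT 1" with a forced string comparison.
lookup=$(pad40 150)
match=$(printf '%s\n' "$rows" | LC_ALL=C sort |
    awk -F, -v ip="$lookup" '($1 "") >= (ip "") { print $2; exit }')
echo "padded:   $match"    # range-B, the range that ends at 199

# Without padding, "99" sorts after "150" as a string, so the wrong row matches.
wrong=$(printf '%s,%s\n' 99 range-A 199 range-B 299 range-C |
    awk -F, -v ip=150 '($1 "") >= (ip "") { print $2; exit }')
echo "unpadded: $wrong"    # range-A, which is incorrect for 150
```

Note that the lookup checks only ip_to. It returns the correct range on the assumption that the ranges in the data do not overlap and leave no gaps.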