Importing IP2Location data into Cassandra and querying with PHP (IPv6)

Intro

The guide will demonstrate how to import IP2Location IPv6 data (DB26 IPv6) in CSV form into Apache Cassandra and then query the data in a PHP page.

First of all, you will need to download the IP2Location IP geolocation DB26 IPv6 CSV file.

Download commercial version at https://ip2location.com/download?code=DB26IPV6

Extract out the IPV6-COUNTRY-REGION-CITY-LATITUDE-LONGITUDE-ZIPCODE-TIMEZONE-ISP-DOMAIN-NETSPEED-AREACODE-WEATHER-MOBILE-ELEVATION-USAGETYPE-ADDRESSTYPE-CATEGORY-DISTRICT-ASN.CSV file from the downloaded zipped file and store in the /mydata folder (our example, yours may differ).

Important Note

We will not cover the installation of Cassandra or PHP in this guide. We will assume you have already setup Cassandra and PHP on the localhost and are using PHP via Apache (also on the localhost). For this example, we are using a Debian machine.

You will also need to install the PHP Cassandra driver from https://pecl.php.net/package/cassandra

Pre-process the CSV data

Before we import the CSV data, we have to insert a dummy column into the data for the partition key. As we will be performing an ordered search, all of the rows will have the same partition key.

In Bash, run the following command to prefix every row in the CSV file with the dummy column and output the results into a new CSV file.

cat /mydata/IPV6-COUNTRY-REGION-CITY-LATITUDE-LONGITUDE-ZIPCODE-TIMEZONE-ISP-DOMAIN-NETSPEED-AREACODE-WEATHER-MOBILE-ELEVATION-USAGETYPE-ADDRESSTYPE-CATEGORY-DISTRICT-ASN.CSV | awk 'BEGIN { FS=OFS="\",\""; } { $1 = substr($1, 2); $1 = sprintf("%40s", $1); gsub(/ /, "0", $1); $2 = sprintf("%40s", $2); gsub(/ /, "0", $2); printf("\"db26ipv6\",\""); print; }' > /mydata/IPV6-COUNTRY-REGION-CITY-LATITUDE-LONGITUDE-ZIPCODE-TIMEZONE-ISP-DOMAIN-NETSPEED-AREACODE-WEATHER-MOBILE-ELEVATION-USAGETYPE-ADDRESSTYPE-CATEGORY-DISTRICT-ASN.CSV2

Importing the CSV data into Cassandra

In the cqlsh, run the following command to create the keyspace (equivalent of a database).

CREATE KEYSPACE IF NOT EXISTS ip2location WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

After creating the keyspace, you need to select it by running the below command.

USE ip2location;

Next, run the following command to create the table.

DROP TABLE IF EXISTS ip2location_db26ipv6;

CREATE TABLE IF NOT EXISTS ip2location_db26ipv6 (
   dummy varchar,
   ip_from varchar,
   ip_to varchar,
   country_code varchar,
   country_name varchar,
   region_name varchar,
   city_name varchar,
   latitude float,
   longitude float,
   zip_code varchar,
   time_zone varchar,
   isp varchar,
   domain varchar,
   net_speed varchar,
   idd_code varchar,
   area_code varchar,
   weather_station_code varchar,
   weather_station_name varchar,
   mcc varchar,
   mnc varchar,
   mobile_brand varchar,
   elevation int,
   usage_type varchar,
   address_type varchar,
   category varchar,
   district varchar,
   asn varchar,
   as varchar,
   PRIMARY KEY (dummy, ip_to)
)
WITH CLUSTERING ORDER BY (ip_to ASC);

Now that we have a table, we will commence the import of data from our CSV file into the table.

COPY ip2location_db26ipv6 (dummy, ip_from, ip_to, country_code, country_name, region_name, city_name, latitude, longitude, zip_code, time_zone, isp, domain, net_speed, idd_code, area_code, weather_station_code, weather_station_name, mcc, mnc, mobile_brand, elevation, usage_type, address_type, category, district, asn, as)
FROM '/mydata/IPV6-COUNTRY-REGION-CITY-LATITUDE-LONGITUDE-ZIPCODE-TIMEZONE-ISP-DOMAIN-NETSPEED-AREACODE-WEATHER-MOBILE-ELEVATION-USAGETYPE-ADDRESSTYPE-CATEGORY-DISTRICT-ASN.CSV2';

Querying the data in PHP

Now, create a PHP file called test.php in your website.

Paste the following PHP code into it and then run it in the browser:

<?php
$ip = '8.8.8.8';

function ip62long($ipv6) {
	return (string) gmp_import(inet_pton($ipv6));
}

function queryIP2Location($myip) {
	$keyspace  = 'ip2location';
	$cluster   = Cassandra::cluster()->build(); // localhost
	$session   = $cluster->connect($keyspace);
	
	$padzero = 40; // need to pad the ip numbers because Cassandra is comparing as strings, not numbers
	
	// convert IP address to IP number
	if (filter_var($myip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4)) {
		$myip = '::FFFF:' . $myip;
	}
	if (filter_var($myip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6)) {
		$ipnum = ip62long($myip);
	}
	
	// pad ipnum to 40 digits with zeroes in front so we can do string comparison
	$myipnum = str_pad($ipnum, $padzero, '0', STR_PAD_LEFT);
	
	$statement = new Cassandra\SimpleStatement('SELECT * FROM ip2location_db26ipv6 WHERE dummy = \'db26ipv6\' AND ip_to >= \'' . $myipnum . '\' ORDER BY ip_to LIMIT 1');
	
	$future = $session->executeAsync($statement);
	$result = $future->get();
	
	if ($result->count() == 0)
		die('No record found' . "<br>\n");
	
	return $result[0];
}

$myresult = queryIP2Location($ip);

echo 'country_code: ' . $myresult["country_code"] . "<br>\n";
echo 'country_name: ' . $myresult["country_name"] . "<br>\n";
echo 'region_name: ' . $myresult["region_name"] . "<br>\n";
echo 'city_name: ' . $myresult["city_name"] . "<br>\n";
echo 'latitude: ' . $myresult["latitude"] . "<br>\n";
echo 'longitude: ' . $myresult["longitude"] . "<br>\n";
echo 'zip_code: ' . $myresult["zip_code"] . "<br>\n";
echo 'time_zone: ' . $myresult["time_zone"] . "<br>\n";
echo 'isp: ' . $myresult["isp"] . "<br>\n";
echo 'domain: ' . $myresult["domain"] . "<br>\n";
echo 'net_speed: ' . $myresult["net_speed"] . "<br>\n";
echo 'idd_code: ' . $myresult["idd_code"] . "<br>\n";
echo 'area_code: ' . $myresult["area_code"] . "<br>\n";
echo 'weather_station_code: ' . $myresult["weather_station_code"] . "<br>\n";
echo 'weather_station_name: ' . $myresult["weather_station_name"] . "<br>\n";
echo 'mcc: ' . $myresult["mcc"] . "<br>\n";
echo 'mnc: ' . $myresult["mnc"] . "<br>\n";
echo 'mobile_brand: ' . $myresult["mobile_brand"] . "<br>\n";
echo 'elevation: ' . $myresult["elevation"] . "<br>\n";
echo 'usage_type: ' . $myresult["usage_type"] . "<br>\n";
echo 'address_type: ' . $myresult["address_type"] . "<br>\n";
echo 'category: ' . $myresult["category"] . "<br>\n";
echo 'district: ' . $myresult["district"] . "<br>\n";
echo 'asn: ' . $myresult["asn"] . "<br>\n";
echo 'as: ' . $myresult["as"] . "<br>\n";
?>

Was this article helpful?

Related Articles