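Recently I have been working on conversion attribution for app ad campaigns. Several search platforms cannot reliably provide the muid, so we need to fall back to an IP + User-Agent (ip-ua) attribution mode.
A quick search showed that many articles parse the UA with uaparser or user-agent-utils. I read through their source code and found that their classification of mobile devices is too coarse for our requirements.
So I decided to write my own.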
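User-Agent format
A User-Agent usually has the following format:
- custom identifier (platform) engine version browser version
Example: Mozilla/5.0 (iPhone; CPU iPhone OS 14_8_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1
What we actually need is the platform information inside the first pair of parentheses; its fields are separated by ASCII (half-width) semicolons.
Common formats: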
- (iPhone; CPU iPhone OS {os version})
- (Linux; {os version}; {lang};{device name} Build/{core version})
Extract the content of the first pair of parentheses from the UA, then analyze its fields one by one.
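A minimal sketch of that extraction step, assuming the platform block is simply the text between the first `(` and the next `)`. The class and method names (`UaPlatformTokens`, `platformTokens`) are made up for illustration and are not part of the utility shown later, which works with regular expressions on the whole UA instead.

```java
import java.util.ArrayList;
import java.util.List;

final class UaPlatformTokens {
    /** Returns the trimmed, semicolon-separated fields of the first "(...)" group, or an empty list. */
    static List<String> platformTokens(String ua) {
        List<String> tokens = new ArrayList<>();
        if (ua == null) {
            return tokens;
        }
        int open = ua.indexOf('(');
        int close = open < 0 ? -1 : ua.indexOf(')', open + 1);
        if (close < 0) {
            return tokens; // no complete parenthesized block
        }
        for (String part : ua.substring(open + 1, close).split(";")) {
            String token = part.trim();
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String ua = "Mozilla/5.0 (Linux; U; Android 11; zh-cn; Redmi K30 Build/RKQ1.200826.002) "
                + "AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/89.0.4389.116 Mobile Safari/537.36";
        // prints [Linux, U, Android 11, zh-cn, Redmi K30 Build/RKQ1.200826.002]
        System.out.println(platformTokens(ua));
    }
}
```

The position and number of fields vary by channel, which is why the rest of the article relies on per-channel regular expressions rather than fixed indexes.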
Together with the front-end team, I collected the user-agents of our common traffic sources:
User-agents of common channels
| Source | Estimated share | Example |
|---|---|---|
| Safari | 20% | Mozilla/5.0 (iPhone; CPU iPhone OS 14_8_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1 |
| Baidu | 17% | Mozilla/5.0 (Linux; Android 9; V1901A Build/P00610; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/76.0.3809.89 Mobile Safari/537.36 T7/12.27 SP-engine/2.37.0 baiduboxapp/12.28.5.10 (Baidu; P1 9) NABar/1.0 |
| QQ Browser | 13% | Mozilla/5.0 (Linux; U; Android 9; zh-cn; LON-AL00 Build/HUAWEILON-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/89.0.4389.72 MQQBrowser/12.1 Mobile Safari/537.36 COVC/045825 |
| Huawei | 13% | Mozilla/5.0 (Linux; Android 10; HarmonyOS; ELS-AN00; HMSCore 6.2.0.302) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.105 HuaweiBrowser/12.0.1.300 Mobile Safari/537.36<br>Mozilla/5.0 (Linux; Android 10; SEA-AL10) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.64 HuaweiBrowser/10.0.3.311 Mobile Safari/537.36 |
| Xiaomi | 10% | Mozilla/5.0 (Linux; U; Android 11; zh-cn; Redmi K30 Build/RKQ1.200826.002) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/89.0.4389.116 Mobile Safari/537.36 XiaoMi/MiuiBrowser/15.6.8<br>Mozilla/5.0 (iPhone; U; CPU iPhone OS 5_1_1 like Mac OS X; en-us) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3 XiaoMi/MiuiBrowser/15.6.8 |
| UC | 10% | Mozilla/5.0 (iPhone; CPU iPhone OS 15_1 like Mac OS X; zh-CN) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/19B74 UCBrowser/13.6.4.1594 Mobile AliApp(TUnionSDK/0.1.20.4)<br>Mozilla/5.0 (Linux; U; Android 11; zh-CN; Mi 10 Build/RKQ1.200826.002) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/78.0.3904.108 UCBrowser/13.6.7.1148 Mobile Safari/537.36 |
| Other | Remainder | Varies |
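As the table shows, the UA layout differs from channel to channel.
With these samples in hand, the code is straightforward:
Code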
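import com.google.common.collect.ImmutableMap;
import lombok.Data;
import lombok.extern.slf4j.Slf4j;

import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

@Slf4j
public class UserAgentUtils {
    private static final String OS_TYPE = "osType";
    private static final String OS_VERSION = "osVersion";
    private static final String OS_DEVICE = "mobileModel";
    private static final String IOS_OS = "ios";

    private final List<StrategyRule> systemInfoStrategy;
    private final List<StrategyRule> deviceInfoStrategy;

    public UserAgentUtils(List<StrategyRule> systemInfoStrategy, List<StrategyRule> deviceInfoStrategy) {
        this.systemInfoStrategy = systemInfoStrategy;
        this.deviceInfoStrategy = deviceInfoStrategy;
    }

    public boolean compareByUserAgent(String ua1, String ua2, String ip1, String ip2) {
        return analysisUserAgent(ua1, ip1).equalWith(analysisUserAgent(ua2, ip2));
    }

    public UserAgentDevice analysisUserAgent(String ua, String ip) {
        UserAgentDevice agentDevice = new UserAgentDevice();
        agentDevice.setIp(ip);
        Map<String, String> systemInfo = handleByStrategy(ua, systemInfoStrategy);
        if (systemInfo == null) {
            log.warn("Unrecognized system; possibly a PC or another category with no regex configured, ua:{}", ua);
        } else {
            agentDevice.setMobileSystem(systemInfo.get(OS_TYPE));
            agentDevice.setSystemVersion(systemInfo.get(OS_VERSION));
        }
        if (IOS_OS.equals(agentDevice.getMobileSystem())) {
            // iOS: the model is "ios" concatenated with the major version.
            // String.split takes a regex, so the dot has to be escaped.
            String version = agentDevice.getSystemVersion();
            String major = version == null ? "" : version.split("\\.", 2)[0];
            agentDevice.setMobileModel(agentDevice.getMobileSystem() + major);
            return agentDevice;
        }
        Map<String, String> osInfo = handleByStrategy(ua, deviceInfoStrategy);
        if (osInfo == null) {
            log.warn("No device model found; possibly a channel format that is not configured, ua:{}", ua);
        } else {
            agentDevice.setMobileModel(osInfo.get(OS_DEVICE));
        }
        return agentDevice;
    }

    /** Returns the fields produced by the first matching rule, or null when no rule matches. */
    public static Map<String, String> handleByStrategy(String ua, List<StrategyRule> strategyList) {
        if (ua == null || strategyList == null) {
            return null;
        }
        Map<String, String> result = new HashMap<>();
        for (StrategyRule rule : strategyList) {
            for (String regular : rule.getRegularList()) {
                // Match the regex against the whole UA
                Matcher matcher = Pattern.compile(regular, Pattern.CASE_INSENSITIVE).matcher(ua);
                if (matcher.find()) {
                    int k = 1;
                    // The i-th function post-processes the i-th capture group
                    for (StrategyFun f : rule.getFunList()) {
                        String group = k <= matcher.groupCount() ? matcher.group(k) : null;
                        k++;
                        result.putAll(f.doAction(group));
                    }
                    return result;
                }
            }
        }
        return null;
    }

    @Data
    public static class StrategyRule {
        private List<String> regularList;
        private List<StrategyFun> funList;
    }

    @Data
    public static class StrategyFun {
        String name;
        String value;
        boolean valueFlag = false;
        String r1;
        String r2;

        /** Never returns null, so callers can pass the result to putAll safely. */
        public Map<String, String> doAction(String str) {
            if (valueFlag && str != null) {
                // Use the captured text as-is
                return ImmutableMap.of(name, str);
            }
            if (value != null) {
                // Use a fixed value regardless of the captured text
                return ImmutableMap.of(name, value);
            }
            if (r1 != null && r2 != null && str != null) {
                // Normalize the captured text, e.g. 14_8_1 -> 14.8.1
                return ImmutableMap.of(name, str.replaceAll(r1, r2));
            }
            return Collections.emptyMap();
        }
    }
}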
The device is matched with regular expressions and returned as a UserAgentDevice, which is then compared:
import lombok.Data;

@Data
public class UserAgentDevice {
    private String mobileSystem;
    private String systemVersion;
    private String mobileModel;
    private String ip;

    /** Same IP is mandatory; in addition at least two of model, system and version must match. */
    public boolean equalWith(UserAgentDevice x) {
        if (x == null || ip == null || !ip.equals(x.getIp())) {
            return false;
        }
        int count = 0;
        if (mobileModel != null && mobileModel.equals(x.getMobileModel())) {
            ++count;
        }
        if (mobileSystem != null && mobileSystem.equals(x.getMobileSystem())) {
            ++count;
        }
        if (systemVersion != null && systemVersion.equals(x.getSystemVersion())) {
            ++count;
        }
        return count > 1;
    }
}
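To make the matching rule concrete, here is a small self-contained sketch of the same logic as `equalWith`: the IP must be identical, and at least two of the three device fields must agree. `FingerprintMatchSketch` and `Fingerprint` are hypothetical stand-ins written only for this illustration, and the IP addresses are documentation-range placeholders.

```java
final class FingerprintMatchSketch {
    /** Hypothetical, stripped-down stand-in for UserAgentDevice. */
    static final class Fingerprint {
        final String ip;
        final String os;
        final String version;
        final String model;

        Fingerprint(String ip, String os, String version, String model) {
            this.ip = ip;
            this.os = os;
            this.version = version;
            this.model = model;
        }
    }

    /** True when the left value is present and equal to the right value. */
    private static boolean same(String a, String b) {
        return a != null && a.equals(b);
    }

    /** Same IP is mandatory; at least two of model, os and version must also match. */
    static boolean matches(Fingerprint a, Fingerprint b) {
        if (!same(a.ip, b.ip)) {
            return false;
        }
        int count = 0;
        if (same(a.model, b.model)) count++;
        if (same(a.os, b.os)) count++;
        if (same(a.version, b.version)) count++;
        return count > 1;
    }

    public static void main(String[] args) {
        Fingerprint click = new Fingerprint("203.0.113.7", "android", "11", "Redmi K30");
        Fingerprint activation = new Fingerprint("203.0.113.7", "android", "11", null);
        Fingerprint otherIp = new Fingerprint("198.51.100.9", "android", "11", "Redmi K30");
        System.out.println(matches(click, activation)); // true: os and version agree, model is missing
        System.out.println(matches(click, otherIp));    // false: different IP
    }
}
```

One consequence worth keeping in mind: for iOS the model is derived from the OS name and major version, so two iOS devices behind the same IP with the same major version already reach two matching fields even if their full versions differ.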
A test class, including the regular expressions I wrote:
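import org.junit.Test;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UserAgentUtilsTest {
    private static final List<String> uaCase = new ArrayList<>();
    private static void uadd(String s) { uaCase.add(s); }
    static void init() {
        // Every test calls init(), so clear first to avoid accumulating duplicates
        uaCase.clear();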
uadd("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)");
uadd("Mozilla/5.0 (iPhone; CPU iPhone OS 14_4_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 swan/2.26.0 swan-baiduboxapp/12.6.5.10 baiduboxapp/12.6.5.10 (Baidu; P2 14.4.2) ");
uadd("Mozilla/5.0 (Linux; Android 8.1.0; DUB-AL00 Build/HUAWEIDUB-AL00; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/76.0.3809.89 Mobile Safari/537.36 T7/12.9 SP-engine/2.21.0 matrixstyle/0 lite baiduboxapp/5.4.0.10 (Baidu; P1 8.1.0) NABar/1.0");
uadd("Mozilla/5.0 (Linux; Android 9.1.0) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/76.0.3809.89 Mobile Safari/537.36 T7/12.9 SP-engine/2.21.0 matrixstyle/0 lite baiduboxapp/5.4.0.10 (Baidu; P1 8.1.0) NABar/1.0");
uadd("Mozilla/5.0 (Linux; Android 10.1.0 ;) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/76.0.3809.89 Mobile Safari/537.36 T7/12.9 SP-engine/2.21.0 matrixstyle/0 lite baiduboxapp/5.4.0.10 (Baidu; P1 8.1.0) NABar/1.0");
uadd("Mozilla/5.0 (Linux; U; Android 10; zh-CN; HLK-AL00 Build/HONORHLK-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/69.0.3497.100 UWS/3.22.2.28 Mobile Safari/537.36 UCBS/3.22.2.28_210922181100 ChannelId(0) NebulaSDK/1.8.100112 Nebula AlipayDefined(nt:WIFI,ws:360|0|3.0) AliApp(AP/10.2.36.8000) AlipayClient/10.2.36.8000 Language/zh-Hans useStatusBar/true isConcaveScreen/false Region/CNAriver/1.0.0");
uadd("Mozilla/5.0 (Linux; Android 11; V2055A; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/87.0.4280.141 Mobile Safari/537.36 VivoBrowser/10.3.18.0");
uadd("Mozilla/5.0 (Linux; U; Android 11; zh-cn; PEGM00 Build/RKQ1.200903.002) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/70.0.3538.80 Mobile Safari/537.36 HeyTapBrowser/40.7.31.1");
uadd("Mozilla/5.0 (Linux; U; Android 11; zh-CN; Mi 10 Build/RKQ1.200826.002) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/78.0.3904.108 UCBrowser/13.6.7.1148 Mobile Safari/537.36");
uadd("Mozilla/5.0 (Linux; U; Android 11; zh-cn; Redmi K30 Build/RKQ1.200826.002) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/89.0.4389.116 Mobile Safari/537.36 XiaoMi/MiuiBrowser/15.6.8");
uadd("Mozilla/5.0 (iPhone; U; CPU iPhone OS 5_1_1 like Mac OS X; en-us) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3 XiaoMi/MiuiBrowser/15.6.8");
uadd("Mozilla/5.0 (Linux; Android 10; HarmonyOS; ELS-AN00; HMSCore 6.2.0.302) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.105 HuaweiBrowser/12.0.1.300 Mobile Safari/537.36");
uadd("Mozilla/5.0 (Linux; Android 10; HarmonyOS; ELS-AN01) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.105 HuaweiBrowser/12.0.1.300 Mobile Safari/537.36");
uadd("Mozilla/5.0 (Linux; Android 11; NTH-AN00; HMSCore 6.2.0.302) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.105 HuaweiBrowser/12.0.1.300 Mobile Safari/537.36");
uadd("Mozilla/5.0 (Linux; Android 10; SEA-AL10) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.105 HuaweiBrowser/12.0.1.300 Mobile Safari/537.36");
uadd("Mozilla/5.0 (Linux; Android; SEA-AL10) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.64 HuaweiBrowser/10.0.3.311 Mobile Safari/537.36");
uadd("Mozilla/5.0 (Linux; U; Android ; zh-cn; LON-AL00 Build/HUAWEILON-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/89.0.4389.72 MQQBrowser/12.1 Mobile Safari/537.36 COVC/045825");
uadd("Mozilla/5.0 (Linux; U; Android ) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/89.0.4389.72 MQQBrowser/12.1 Mobile Safari/537.36 COVC/045825");
uadd("Mozilla/5.0 (Linux; U; Android 9; zh-cn; KB2000 Build/RP1A.201005.001) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/89.0.4389.72 MQQBrowser/12.1 Mobile Safari/537.36 COVC/045825");
uadd("Mozilla/5.0 (Linux; Android 9; V1901A Build/P00610; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/76.0.3809.89 Mobile Safari/537.36 T7/12.27 SP-engine/2.37.0 baiduboxapp/12.28.5.10 (Baidu; P1 9) NABar/1.0");
uadd("Mozilla/5.0 (iPad; U; iPad OS 5_1_1 like Mac OS X; en-us) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3 XiaoMi/MiuiBrowser/15.6.8");
uadd("Mozilla/5.0 (iPhone; CPU iPhone OS 14,4,2 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3 XiaoMi/MiuiBrowser/15.6.8");
uadd("Mozilla/5.0 (iPhone; CPU iPhone OS 11-6 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3 XiaoMi/MiuiBrowser/15.6.8");
uadd("dadashop/dadashop_version (iPhone; CPU iPhone OS 20_8_1)");
uadd("dadashop/dadashop_version (iPhone; CPU iPhone OS 19_8_1 ;u)");
uadd("dadashop/dadashop_version (iPhone; CPU iPhone OS 17_8_1;u)");
uadd("dadaShop/8.22.0 (com.dada.store; build:615; iOS 18.8.1) Alamofire/8.22.0");
    }

    @Test
    public void iosTest() {
        init();
        String ppp = "[^a-zA-Z0-9 ]";
        List<String> pList = Arrays.asList("(ip[honead]+)(?:.*os\\s([\\w.,/\\-]+)\\slike|;\\sopera)",
                "(ip[honead]+).*os\\s([\\w.,/\\-]+)[);]",
                "(ios)\\s([\\w.,/\\-]+)[);]",
                "(ios|ip[honead]+)\\s*([;)])");
        for (String rp : pList) {
            int i = 0;
            Pattern r = Pattern.compile(rp, Pattern.CASE_INSENSITIVE);
            for (String s : uaCase) {
                Matcher m = r.matcher(s);
                if (m.find()) {
                    System.out.println(i + " : " + m.group(0));
                    System.out.println("name : " + m.group(1));
                    System.out.println("version : " + m.group(2).replaceAll(ppp, ".") + "\n");
                    i++;
                }
            }
            System.out.println("-------------------\n");
        }
    }

    @Test
    public void harmonyTest() {
        init();
        String pattern = "linux;.*android.*(harmony).*;\\s*([\\w.,/\\-]+)\\s*[;)]";
        int i = 0;
        Pattern r = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
        for (String s : uaCase) {
            Matcher m = r.matcher(s);
            if (m.find()) {
                System.out.println(i + " : " + m.group(0));
                System.out.println("name : " + m.group(1));
                System.out.println("version : " + m.group(2) + "\n");
                i++;
            }
        }
    }

    @Test
    public void androidTest() {
        init();
        String ppp = "[^a-zA-Z0-9 ]";
        List<String> pList = Arrays.asList("linux;.*(android)\\s([\\w.,/\\-]+)\\s*[;)]",
                "linux;.*(android)\\s*([;)])");
        for (String rp : pList) {
            int i = 0;
            Pattern r = Pattern.compile(rp, Pattern.CASE_INSENSITIVE);
            for (String s : uaCase) {
                Matcher m = r.matcher(s);
                if (m.find()) {
                    System.out.println(i + " : " + m.group(0));
                    System.out.println("name : " + m.group(1));
                    System.out.println("version : " + m.group(2).replaceAll(ppp, ".") + "\n");
                    i++;
                }
            }
            System.out.println("-------------------\n");
        }
    }

    @Test
    public void deviceTest() {
        init();
        List<String> pList = Arrays.asList(
                ";\\s*([\\w.,/\\- ]+)\\sbuild/",
                "linux;.*android.*harmony.*;\\s*([\\w.,/\\- ]+);.*HuaWeiBrowser",
                "linux;.*android.*harmony.*;\\s*([\\w.,/\\- ]+)[)].*HuaWeiBrowser",
                "linux;.*android.*;\\s*([\\w.,/\\- ]+);.*HuaWeiBrowser",
                "linux;.*android.*;\\s*([\\w.,/\\- ]+)[)].*HuaWeiBrowser",
                ".*android.*;\\s*([\\w.,/\\- ]+); wv[)].*VivoBrowser"
        );
        for (String rp : pList) {
            int i = 0;
            Pattern r = Pattern.compile(rp, Pattern.CASE_INSENSITIVE);
            for (String s : uaCase) {
                Matcher m = r.matcher(s);
                if (m.find()) {
                    System.out.println(i + " : " + m.group(0));
                    System.out.println("name : " + m.group(1) + "\n");
                    i++;
                }
            }
            System.out.println("-------------------\n");
        }
    }
}
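Next is the JSON configuration for systemInfoStrategy. Rules are tried in order, and each entry in funList handles the capture group at the same position: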
[
  {
    "regularList": [
      "(ip[honead]+)(?:.*os\\s([\\w.,/\\-]+)\\slike|;\\sopera)",
      "(ip[honead]+).*os\\s([\\w.,/\\-]+)[);]",
      "(ios)\\s([\\w.,/\\-]+)\\s*[);]"
    ],
    "funList": [
      {
        "name": "osType",
        "value": "ios"
      },
      {
        "name": "osVersion",
        "r1": "[^a-zA-Z0-9 ]",
        "r2": "."
      }
    ]
  },
  {
    "regularList": [
      "(ios|ip[honead]+)\\s*([;)])"
    ],
    "funList": [
      {
        "name": "osType",
        "value": "ios"
      },
      {
        "name": "osVersion",
        "value": "unknown"
      }
    ]
  },
  {
    "regularList": [
      "linux;.*android.*(harmony).*;\\s*([\\w.,/\\-]+)\\s*[;)]"
    ],
    "funList": [
      {
        "name": "osType",
        "value": "HarmonyOS"
      },
      {
        "name": "osVersion",
        "valueFlag": true
      }
    ]
  },
  {
    "regularList": [
      "linux;.*(android)\\s([\\w.,/\\-]+)\\s*[;)]"
    ],
    "funList": [
      {
        "name": "osType",
        "value": "android"
      },
      {
        "name": "osVersion",
        "r1": "[^a-zA-Z0-9 ]",
        "r2": "."
      }
    ]
  },
  {
    "regularList": [
      "linux;.*(android)\\s*([;)])"
    ],
    "funList": [
      {
        "name": "osType",
        "value": "android"
      },
      {
        "name": "osVersion",
        "value": "unknown"
      }
    ]
  }
]
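To show what such a configuration does at runtime, the sketch below hard-codes two of the entries (the first iOS regex and the Android regex) and applies them in order. It is a simplified illustration, not the loader used by `UserAgentUtils`: the `SystemRuleSketch` and `Rule` names are invented here, the constant `osType` replaces the first capture group, and the second group is normalized with the same `r1`/`r2` replacement as in the config.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SystemRuleSketch {
    /** Hypothetical simplification of one config entry: a regex plus a constant osType. */
    static final class Rule {
        final Pattern pattern;
        final String osType;

        Rule(String regex, String osType) {
            this.pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
            this.osType = osType;
        }
    }

    // Regexes copied from the first iOS entry and the versioned Android entry of the config.
    static final List<Rule> RULES = Arrays.asList(
            new Rule("(ip[honead]+)(?:.*os\\s([\\w.,/\\-]+)\\slike|;\\sopera)", "ios"),
            new Rule("linux;.*(android)\\s([\\w.,/\\-]+)\\s*[;)]", "android"));

    /** Returns osType/osVersion from the first matching rule, or an empty map. */
    static Map<String, String> detect(String ua) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Rule rule : RULES) {
            Matcher m = rule.pattern.matcher(ua);
            if (!m.find()) {
                continue;
            }
            result.put("osType", rule.osType);
            String raw = m.group(2); // null if the "; opera" alternative matched
            if (raw != null) {
                // Same normalization as r1/r2 in the config: 14_8_1 -> 14.8.1
                result.put("osVersion", raw.replaceAll("[^a-zA-Z0-9 ]", "."));
            }
            return result;
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {osType=ios, osVersion=14.8.1}
        System.out.println(detect("Mozilla/5.0 (iPhone; CPU iPhone OS 14_8_1 like Mac OS X) "
                + "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1"));
        // prints {osType=android, osVersion=11}
        System.out.println(detect("Mozilla/5.0 (Linux; U; Android 11; zh-cn; Redmi K30 Build/RKQ1.200826.002) "
                + "AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/89.0.4389.116 Mobile Safari/537.36"));
    }
}
```

Compiling each pattern once, as done here, also avoids the repeated `Pattern.compile` call per UA; whether that matters depends on the traffic volume.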
If you have any questions, feel free to discuss them in the comments.



