We have run into an annoying situation: a hardware-dependent limit on the number of user groups on a Palo Alto Next-Generation Firewall. That is, we cannot add any more Active Directory groups to our firewalls. The weird thing is that we don’t need that many synced groups on our Palo, but we have to sync them anyway since we are using nested groups for our users. Palo Alto does not support nested groups out of the box; it needs all intermediary groups in order to retrieve the users, which results in a large number of unnecessary groups.
This is our main problem on a PA-5220: “User Group count of xyz exceeds threshold of 10000”:
Same on a PA-820, where the limit is 1000:
How we are using AD groups
TL;DR: Our users are members of container groups, which in turn are members of different permission groups, e.g., “firewall” groups. We are using *only* such firewall groups in our Palo Alto policies, while the users themselves are only in the container groups.
A good article describing those nested groups permissions is here: Active Directory nesting groups strategy and implementation.
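To make the nesting concrete, here is a minimal sketch of such a group structure and of what "resolving the users of a firewall group" means. Only the group name prod_fw_example1 appears in this post; the intermediate group, the container groups, and the user names are made up for illustration.

```python
from collections import deque

# Hypothetical directory: group name -> direct members (users or other groups).
DIRECT_MEMBERS = {
    "prod_fw_example1": ["role_engineering"],            # firewall group used in policies
    "role_engineering": ["container_team_a", "container_team_b"],
    "container_team_a": ["alice", "bob"],                 # users live only in containers
    "container_team_b": ["bob", "carol"],
}


def transitive_users(group, direct_members):
    """Return all users reachable from `group` through nested groups."""
    users = set()
    seen = {group}
    queue = deque([group])
    while queue:
        current = queue.popleft()
        for member in direct_members.get(current, []):
            if member in direct_members:
                # The member is itself a group: descend into it once.
                if member not in seen:
                    seen.add(member)
                    queue.append(member)
            else:
                users.add(member)
    return users


print(sorted(transitive_users("prod_fw_example1", DIRECT_MEMBERS)))
# ['alice', 'bob', 'carol']
```

Note that the firewall group itself has no direct user members at all; every user is found only by walking down through the nested groups.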
The Palo Problem
On a Palo, the user groups are synced from the Active Directory (LDAP profile) within Device -> User Identification -> Group Mapping Settings. The “Search Filter” limits the groups. In our case, we would *only* need our firewall groups. The “Group Member” attribute is set to “member” by default:
Using this setup, we are NOT retrieving users within our groups at all. :( In order to retrieve the users, we have to include the full path from our user groups to our firewall groups. (PAN: Nested User Groups in User-ID) In our case, this is:
Doing it that way, we now have all users listed within our firewall groups. Good so far. BUT:
The key question is: Why is Palo Alto, read: the LDAP implementation, not able to crawl the users within nested groups without having to retrieve all intermediary groups?!?
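The effect on the group count can be sketched as follows. This is only a model of the behaviour described above (a reader that sees direct "member" values and nothing else), not Palo Alto's actual implementation; all group and user names except prod_fw_example1 are hypothetical.

```python
# Hypothetical directory: group name -> direct members (users or other groups).
DIRECT_MEMBERS = {
    "prod_fw_example1": ["role_a", "role_b"],
    "prod_fw_example2": ["role_b"],
    "role_a": ["container_1", "container_2"],
    "role_b": ["container_2", "container_3"],
    "container_1": ["alice"],
    "container_2": ["bob"],
    "container_3": ["carol"],
}


def groups_to_sync(policy_groups, direct_members):
    """Groups that must be synced so that a reader limited to direct
    members can still reach every user of the given policy groups."""
    needed = set()
    stack = list(policy_groups)
    while stack:
        group = stack.pop()
        if group in needed:
            continue
        needed.add(group)
        # Every nested group on the path down to the users is required too.
        stack.extend(m for m in direct_members.get(group, []) if m in direct_members)
    return needed


policy_groups = ["prod_fw_example1", "prod_fw_example2"]
synced = groups_to_sync(policy_groups, DIRECT_MEMBERS)
print(len(policy_groups), "groups used in policies,", len(synced), "groups synced")
# 2 groups used in policies, 7 groups synced
```

Even in this tiny model, the synced group count is several times the number of groups that actually appear in policies, and it is the synced count that runs into the platform limit.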
(Not-working-) Ideas
We googled a lot, opened a support ticket, and had several discussions with colleagues and Palo SEs, along with packet captures and Wireshark deep dives to understand how the LDAP queries work. ;) Here are a couple of ideas, none of which are feasible for us:
- Using a special LDAP search string for the group member attribute: “1.2.840.113556.1.4.1941”, LDAP_MATCHING_RULE_IN_CHAIN. We tried different settings such as “member:1.2.840.113556.1.4.1941:” or “(member:1.2.840.113556.1.4.1941:={0})”. Though we were able to commit, no users came in.
- Using the msds-memberTransitive attribute for the group member. In theory a very good idea: when a group is queried with this attribute, the LDAP server returns all users that reside in subjacent groups. Nice! Indeed, this would have solved our issue, if Palo Alto didn’t have a bug right there. :( In fact, on every 11th group refresh, the Palo does a full sync, which failed when using this attribute. PAN engineering confirmed this erroneous behaviour; however, “this cannot be fixed”. Every 11th time, “msds-memberTransitive” is used in an LDAP search request without specifying a base object. Hence the AD is not able to answer this extensive search and replies with an error: “00002120: SvcErr: DSID-03120451, problem 5012 (DIR_ERROR), data 592062”. Here’s how we used it:
- Manually using the “Group Include List”. Sure, when googling for our problem, the first article that comes up shows how to use the include list. In my opinion, this is ridiculous. We are using a couple of VSYSes with some more AD domains, with new (or obsolete) groups every now and then. Having to specify the relevant groups manually would be a major step back. Please note: We are using a Next-Generation firewall here!
- Rebuilding our AD into another design without using nested groups. Well. Yeah. Thanks for the idea.
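For reference, this is what the LDAP_MATCHING_RULE_IN_CHAIN OID from the first idea looks like in a regular Active Directory search filter, e.g. when cross-checking with a generic LDAP client which users AD itself considers nested members of a group. This is a minimal sketch that only builds the filter string; it is not a Palo Alto setting, and the group DN below is a hypothetical example.

```python
# OID of the Active Directory matching rule that follows group nesting.
LDAP_MATCHING_RULE_IN_CHAIN = "1.2.840.113556.1.4.1941"

# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_FILTER_ESCAPES = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}


def escape_filter_value(value):
    """Escape special characters in an LDAP filter assertion value."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def nested_members_filter(group_dn):
    """Build a filter matching users that are direct or nested members of group_dn."""
    return "(&(objectCategory=person)(objectClass=user)(memberOf:{oid}:={dn}))".format(
        oid=LDAP_MATCHING_RULE_IN_CHAIN,
        dn=escape_filter_value(group_dn),
    )


# Hypothetical DN; replace with the distinguished name of a real firewall group.
print(nested_members_filter("CN=prod_fw_example1,OU=Groups,DC=example,DC=com"))
```

The resulting string can be used as the search filter of a subtree search over the user base DN. The matching rule is evaluated by the domain controller, so the client never needs to know the intermediary groups.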
In the end, PAN support told us to file a feature request. Hm. Obviously, they are not accepting it as a bug, though engineering confirmed it…
Are we the only ones running into this?
Appendix: Used CLI commands
show user group-mapping statistics
show user group list | match prod_fw_example1
show user group name "abraham\prod_fw_example1"
show user user-ids match-user weberjoh
debug user-id reset group-mapping all
debug user-id on debug
debug user-id set ldap basic
debug user-id off
tcpdump snaplen 0 filter "host 198.51.100.42 and port 389"
scp export mgmt-pcap from mgmt.pcap to weberjoh@lx.weberlab.de:.

Photo by Jonathan Ford on Unsplash.