I have a Voice Dictation command "Tag Photo" working with an AppleScript that sets the "Photo" tag on the selected filesystem objects. The AppleScript is written so that the only change you need to make for each copy of the script is the tag string in the TAG_NAME property. If a filesystem object already has tags on it, the script adds your tag name to that file or folder alongside them.
Here is the AppleScript that you copy/paste into Script Editor (Launchpad > Other > Script Editor). You then click the compile button and save it twice: 1) in text format as tagname_photo.applescript in your preferred location, and 2) with option+Save As, in script format as tagname_photo.scpt in the /Users/yourname/Library/Scripts/Applications/Finder folder. The latter is where you will tell your new Voice Control command to find the workflow…
The AppleScript:
(*
AppleScript that checks selected filesystem object(s) for existing tags
and adds the current tag name defined in the property beginning with TAG_
This script does not gather all defined tag names and attempt to validate
the user supplied tag name.
The user changes *one* location <== EDIT ==> for different tag name strings
Reference: https://discussions.apple.com/thread/252807460
Tested: macOS 11.4
VikingOSX, 2021-05-29, Apple Support Communities, No warranties expressed/implied
*)
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Cocoa"
use scripting additions
property NSUTF8StringEncoding : a reference to 4
property NSString : a reference to current application's NSString
property NSWorkspace : a reference to current application's NSWorkspace
property NSURL : a reference to current application's NSURL
property NSArray : a reference to current application's NSArray
property NSURLTagNamesKey : a reference to current application's NSURLTagNamesKey
property NSDictionary : a reference to current application's NSDictionary
property TAG_NAME : {"Photo"} # <== EDIT ==>
# assumption is that the filesystem object(s) are already selected before this
# script is invoked by Dictation.
set tagFileList to {}
# process selected items
tell application "Finder"
if selection is {} then return
set selList to selection
repeat with anItem in selList
copy POSIX path of (anItem as text) to the end of tagFileList
end repeat
end tell
repeat with anItem in tagFileList
my set_tag(anItem, TAG_NAME)
end repeat
return
on set_tag(theFile, atag)
-- atag is always a list, even if single tag element
-- place single, or multiple tags from atag on the specified file
set existingTags to {}
set tagArray to (NSArray's arrayWithArray:atag)'s mutableCopy()
set existingTags to my file_tags(theFile)
if (count of existingTags) ≥ 1 then
# add the detected tag(s) to the existing TAG_NAME list item
tagArray's addObjectsFromArray:existingTags
end if
set fileURL to NSURL's fileURLWithPath:theFile
fileURL's setResourceValue:tagArray forKey:NSURLTagNamesKey |error|:(missing value)
return
end set_tag
on file_tags(apath)
-- Get all tags from the file, or an empty list if none
-- apath = POSIX path to file
set metadata to NSDictionary's dictionary()
set aurl to NSURL's fileURLWithPath:apath
set metadata to aurl's resourceValuesForKeys:{NSURLTagNamesKey} |error|:(missing value)
-- file has no tags yet
if (metadata's objectForKey:NSURLTagNamesKey) is missing value then return {}
-- file has tags, so return them as a list
return (metadata's objectForKey:NSURLTagNamesKey) as list
end file_tags
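The set_tag handler appends the existing tags to the TAG_NAME list without checking for repeats, so a file that already carries "Photo" ends up with that name in the array twice. Whether macOS collapses the duplicate has not been verified here. A minimal sketch of a merge step in plain AppleScript follows; merge_tags is a hypothetical helper that is not part of the script above, and it assumes both arguments are ordinary AppleScript lists of text (such as the list returned by file_tags).

```applescript
-- Hypothetical helper (an assumption, not part of the original script).
-- Returns the existing tags followed by any new tags that are not already
-- present, skipping empty strings.
on merge_tags(existingTags, newTags)
	set merged to {}
	repeat with aTag in (existingTags & newTags)
		-- dereference the loop item and coerce it to plain text
		set tagText to (contents of aTag) as text
		if tagText is not "" and merged does not contain tagText then
			set end of merged to tagText
		end if
	end repeat
	return merged
end merge_tags

merge_tags({"Photo", "Work"}, {"Photo"}) -- {"Photo", "Work"}
```

AppleScript compares text case-insensitively by default, so this sketch treats "photo" and "Photo" as the same tag. The merged list could then be passed to setResourceValue in place of tagArray.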
In System Preferences > Accessibility > Voice Control > Commands…, create your new command with the [+] button and configure it as follows. For testing purposes, I did not save the script as tagname_photo.scpt, but you should. Each variation of the script should follow the same naming format, with the tag name that matches its voice command in the filename.
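Because TAG_NAME is a list and set_tag places every entry of that list on the file, a variation of the script only differs in this one line. As an illustration, assuming a second tag named "Vacation" (a made-up name, not from the original post), the edited property might look like this:

```applescript
-- The one line to edit in each variation of the script.
-- "Vacation" is a hypothetical tag name used only for illustration.
property TAG_NAME : {"Photo", "Vacation"} # <== EDIT ==>
```

A single-tag variation keeps the braces, for example {"Vacation"}, since the handler expects a list even for one tag.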

Make certain that your file(s) are selected before speaking your voice command to trigger the script.